PREFACE


It gives me pleasure to thank B.M. Bolotovsky and S.N. Sto- 
lyarov who have written §§ 6.14, 6.15 of this book. I wish to ex- 
press my special gratitude to V.L. Ginzburg. This book quotes 
many things which I learned at the seminar led by him. A few
questions were discussed with him directly; in particular, the 
problem of an energy-momentum-tension tensor should be men- 
tioned. Finally, V.L. Ginzburg has written the article “Who 
Developed the Special Theory of Relativity, and How?” to be 
published in this book (Supplement I). In my opinion, this article 
gives very precise answers to questions which would be met by 
anyone interested in the history of the evolution of the STR. I feel
honoured to have this article included in the book.


The author 
CHAPTER 1


CLASSICAL MECHANICS 
AND THE PRINCIPLE 
OF RELATIVITY 


§ 1.1. A coordinate system and a reference frame in classical 
mechanics. All natural phenomena happen in space and in the 
course of time, and an element of any phenomenon is something 
occurring at a given moment of time and at a given point in space. 
In the special theory of relativity * it is customary to refer to that 
“something” taking place at a given point and at a given moment 
of time (in fact, something concentrated in a sufficiently small 
volume of space and limited by a small time interval) as an event. 
This definition shows that concrete features of an event may be 
very different. That is why it is usual to indicate that “the event 
consists in ...”. The examples of events can be the emission of a 
light signal from a certain point in space at a certain moment of 
time, or the presence of a moving particle (a material point) at 
a given point in space and at a given moment of time. 

When an event is realized, one says that it “happened” (or is 
happening, or will happen). Any physical phenomenon represents 
a sequence of events. A description of a separate event serves as 
a basis for the description of any phenomenon and therefore we 
begin with the description of a separate event. 

To characterize a point in space where an event occurred, every 
point in space has to be labelled before specific physical phenomena
are analysed. But space is uniform and isotropic and this
implies that all points in space and all directions in it are equal. 
It should be pointed out at once that we deal here with the free 
space, or vacuum. The investigation of physical phenomena in 
vacuo is of prime importance for the special theory of relativity. 
Even though vacuum is a complex physical system, it is sufficient 
for our purpose to assume that in the space domain which we 
take for vacuum, no substance possessing a finite rest mass is 
present and gravitational and electric fields are not too
strong. 

But even when all points in space are equal, one can still 
single out a certain point by placing a material object, i.e. an


* Hereinafter the complete term “special theory of relativity” will
sometimes be abbreviated as STR.

object having a finite rest mass, in it. Points in space are usually
labelled by means of a coordinate system. With the help of the 
material object we distinguish a point which is the origin of
coordinates. The simplest coordinate system is the Cartesian system.
Its construction begins with the tracing of three mutually
perpendicular straight lines, i.e. the coordinate X, Y, Z axes. In 
terms of physics, however, these are not just abstract straight 
lines. Theoretically, the coordinate axes are rigid non-deformable * 
solids. By the way, instruments, standards and other objects of 
a given reference frame will be always fixed to them and there- 
fore it should be borne in mind that a physical coordinate system 
is always a material object. 

In the Cartesian coordinate system points are quite easy to 
label. From any point M in space one can construct the perpen- 
diculars to the X, Y, Z axes or, in other words, project this point 
on the coordinate axes. Having measured the distances of the 
point projections from the origin along the X, Y, Z axes by means 
of the chosen scale, we obtain the numbers x, y, z, which are called 
the Cartesian coordinates of the point. The distances can be measured
via the step-by-step transposition of a unit scale along the
axis from the origin to the point projection on the axis. In fact, 
such a procedure used for length measurements in everyday life 
can also be used for determining the length of a stretch or an 
object if it is at rest in a given coordinate system. As we shall 
see later, the special theory of relativity furnishes a very convenient
method of measuring distances without recourse to rigid
scales and their step-by-step transposition (see Chapter 2). Both 
methods are equivalent, of course. 

Thus through the introduction of the Cartesian coordinate system
every point in space acquires three numbers, that is the three
Cartesian coordinates x, y, z. The principal objective of physics, 
however, is to study motion. Although mechanical motion is the 
simplest type of motion, its description requires time measurements
and therefore the coordinate system must necessarily be
supplemented by a clock. This clock is needed to register the
occurrence of events at various points in space. How many clocks 
are needed? 

In classical mechanics one does not usually hesitate over the
answer to this question and tacitly assumes that one clock resting
in a given coordinate system is enough. It is useful to find out 
what this assumption implies. Let the clock be located at the 
origin of the coordinate system. Events may happen at any points 


* The STR negates the existence of absolutely rigid bodies (see Chapter 8) but for
the coordinate axes it is just sufficient not to be very elastic.



in space including those removed far enough from the origin. 
Then how can the clock, removed from the place where an event 
happens, register that event? Obviously, just at the moment the 
event occurs a certain signal has to be sent from the place of 
occurrence of the event to the clock located at the origin. If the 
velocity of the signal is finite, it will reach the clock some time 
after the onset of the event, and the time lag will depend on the 
distance between the point where the event occurred and the clock. 
In classical mechanics, however, it is assumed that basically there 
may be signals propagating infinitely fast. It is obvious that in 
this case one clock fixed rigidly to any point in the coordinate
system will be enough. 

It is implied that the onset of an event is registered as follows:
at the moment of an event occurring at any point in space a 
signal is sent from that point to the clock, and the time of its 
arrival is thus the time of the onset of the event (the velocity of 
the signal is infinite!). The assumption concerning the infinitely 
fast signals applies, of course, not only to the registration of 
events. In Newtonian mechanics it is incorporated intrinsically:
interactions between bodies are transmitted infinitely fast (see
§ 1.4).

Modern physics, however, claims that all signals (interactions) 
are transmitted at a finite speed; in other words, there is a finite 
velocity of interaction transmission. How can this fact be reconciled
with the evidence that Newtonian mechanics based on the
assumption about the infinitely fast signals copes excellently with
many problems (for example, calculates superbly the motion of 
planets in the solar system)? The answer to this question is very 
simple. The ultimate speed at which a signal, or an interaction, is 
transmitted is very great. According to contemporary ideas it
is the speed of electromagnetic waves in vacuo, which is equal to
approximately 3·10⁸ m/s. It follows that as far as velocities of
objects to be considered are essentially less than that of light in 
vacuo and characteristic distances are such that the time of light 
propagation along them is negligibly small, Newtonian mechanics 
is correct and one clock is enough to register the time of events. 
Yet it is at once clear that in the case of a fast motion (v ~ c)
and extended systems a time of an event has to be registered 
otherwise and the whole science of mechanics has to be based on 
different premises. In fact, this is just what the special theory of 
relativity does when it explicitly takes into account the finite 
velocity of the interaction transmission. 
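The condition that "the time of light propagation is negligibly small" can be illustrated with a rough numerical sketch. The function name and the rounded distances below are assumptions chosen only for illustration; the signal speed is the approximate value quoted in the text.

```python
# Illustrative sketch: time lag between an event and its registration
# by a clock located at a distance from the event.
C = 3.0e8  # approximate speed of light in vacuo, m/s (value quoted in the text)

def signal_delay(distance_m, signal_speed=C):
    """Time a signal of finite speed needs to cover the given distance."""
    return distance_m / signal_speed

# Rounded distances, assumed here purely for illustration.
distances = {
    "laboratory bench (1 m)": 1.0,
    "Earth to Moon (about 3.8e8 m)": 3.8e8,
    "Earth to Sun (about 1.5e11 m)": 1.5e11,
}

for label, d in distances.items():
    print(f"{label}: {signal_delay(d):.3g} s")
```

For laboratory distances the lag is a few nanoseconds, which is why a single clock suffices in everyday mechanics, while for extended systems the lag reaches seconds or minutes.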

Now let us get back to the classical pattern. A reference frame 
is formed by a reference object with a coordinate system, a set 
of length standards and a clock fixed rigidly to a reference object. 
In physics a reference frame is always implied since any 
measurement taken by an instrument produces a result that is
related to the reference frame in which this instrument is at rest. 

§ 1.2. The choice of a reference frame. To tackle concrete prob- 
lems, we choose a convenient reference frame and a convenient 
coordinate system. How does this opportunity of choice come 
about? As to a clock, in classical mechanics every reference frame 
needs only one ideal clock. But a reference object, an origin and 
directions of coordinate axes can be chosen at will. It is well 
known how this circumstance is utilized in geometry. For example, 
the equation of an ellipse has the simple form x²/a² + y²/b² = 1
only if the origin is placed in the centre of the ellipse and the 
coordinate axes coincide with its principal axes. No doubt, all 
typical features of the ellipse remain for any other choice of the 
coordinate system, but all formulae become immeasurably more 
complex. It is important to point out here that in analytical geo- 
metry the transition from one coordinate system to another varies 
only the algebraic form of equations of geometric objects while 
the objects themselves naturally remain invariable. 

Considering physical phenomena, one may also set up a coor- 
dinate system rather arbitrarily. However, the two most signi- 
ficant properties of vacuum space are implicitly meant in this 
case: uniformity and isotropy. Uniformity is identity of all points 
in space. This property is very essential. Actually, it enables us 
to use physics. Laws of physics prove to be the same at various 
points of the Earth, and everywhere within the solar system, for 
that matter. But this is just what permits the origin to be placed 
at any convenient point. When we turn a coordinate system 
around the origin, we do not expect anything to change. This 
implies that all directions running from a given point are identi- 
cal in their properties. And this is exactly how isotropy of space is 
defined. In classical mechanics, or, more precisely, in reference 
frames where the Newtonian laws are valid (see § 1.5), uniformity 
and isotropy of free space are assumed. 

In contrast to geometry, in physics there is another choice of 
reference frames: one may consider those moving relative to one 
another. This is quite superfluous in geometry. But in physics the 
reference frames moving relative to one another are the inevitable 
occurrence. For example, physical experiments can be carried out 
aboard a spaceship and on the Earth. These are the two reference
frames, each of which may have instruments motionless relative 
to the frame. As soon as we accept the reference frames moving 
relative to one another, the two intrinsically different but funda- 
mental questions crop up. 

1. How does the motion of a reference frame affect physical 
phenomena observed in it, i.e. do the physical laws change on
transition from one such frame to another? 
2. Suppose we observe a concrete physical phenomenon by
means of instruments resting in a certain reference frame, and 
obtain some values as a result of measurements of physical 
quantities characterizing this phenomenon. The same phenomenon 
can be observed in another reference frame moving relative to the 
first one. The measurements conducted in the second coordinate 
system will give us certain numbers defining the same physical 
quantities. How do these quantities correlate? 

It is important to note here that the same phenomenon is ob- 
served in both systems. We must know how to correlate these 
quantities. After all, a reference frame is an artificial construction 
created for measurement purposes. The phenomenon itself, just
as the laws of nature, cannot be affected by the choice of a ref- 
erence frame. The natural phenomenon is the objective reality 
existing outside our senses and measurements. 

Of course, the results of measurements may prove to be different 
in different reference frames but in any case we must know how 
to convert the results of observations obtained in one frame into 
those that are obtained, or can be obtained, in another. In short, 
we need a method to transform results of measurements. How can 
such a method be found? 

The answer to the first question leads us to the principle of rel- 
ativity and via Newton’s laws helps to distinguish a special class 
of reference frames, that is inertial frames (§ 1.5). The answer 
to the second question is given by the rules for the transformation 
of coordinates of an event, i.e. the Galilean transformation in
classical mechanics (§ 1.3) or the Lorentz transformation in rel- 
ativistic mechanics (§§ 2.4, 2.5, 2.7). 

§ 1.3. The Galilean transformation. The transition from one 
reference frame to another one moving relative to the first one 
was performed long before the advent of the theory of relativity. 
Apparently the first to use the technique was Huygens who treated 
in this way the problem of the collision of spheres. For the sake 
of brevity we shall designate the reference frame by the letter K, 
and provided there are several frames we shall introduce super- 
scripts (K°, K’, K”, ...). We have mentioned that an event can 
be considered as an “element” of some physical phenomenon. It 
is natural to begin with the conversion of the quantities charac- 
terizing an event when a transition from one reference frame to 
another takes place. From now on “the transition from one ref- 
erence frame to another” will be everywhere understood as the 
consideration of those reference frames which move relative to
one another. A shift of the origin as well as a rotation of coor- 
dinate axes will not be taken for a “transition”. 

In an arbitrary reference frame K an event is described by the 
four numbers: x, y, z, t. The first three of these are the coordinates
of the point at which the event happened, while the last one
specifies the moment of time at which it happened. We want to
know how the corresponding four numbers x’, y’, z’, t’ look in another
reference frame K’ moving relative to the frame K.

From the very beginning we are compelled to restrict our problem
and consider only those reference frames which move uniformly
and rectilinearly relative to one another, and do not rotate
around the origin. In other words, none of the considered
reference frames moves with an acceleration relative to
another frame. Somewhat later it will become clear that we deal
here with the collection of so-called inertial frames of reference.
However, since such reference frames can be discriminated only on
the basis of Newton’s laws, we shall postpone the definition and
identification of such frames till § 1.5. For the present, we
shall consider the two frames K and K’ moving uniformly and
rectilinearly (translationwise) relative to each other, in terms
of geometry. Let us assume that the reference frame K’ moves
relative to K at a velocity V. Suppose that at a given moment of
time t the radius vector of the point M in the frame K’ is
equal to r’. Then it can be seen from Fig. 1.1 that r’ = r − R,
where r is the radius vector of the same point in the frame K, and
R is the radius vector of the origin of the coordinate system K’
taken from the origin of K. This relation is valid for any moment
of time and R varies according to the familiar law R = Vt + R₀,
where R₀ is the radius vector specifying the location of the origin
O’ at the moment of time t = 0. Taking into account that at the
moment t = 0 both origins coincide, we have R = Vt and obtain the
coordinate transformation law in the vector form:

r’ = r − Vt, (1.1)

where the components of the vector V are defined in the frame K.

Fig. 1.1. The two reference frames K and K’ with the arbitrarily
directed axes x, y, z and x’, y’, z’. The frame K’ moves relative
to K at the velocity V. The radius vector of the point M is equal
to the vector r in the frame K and to r’ in the frame K’. According
to the vector summation rule r = r’ + R, where R is the radius
vector of the origin O’.


Now we can resort to isotropy of space and rotate each of the 
systems K and K’ around its respective origin. It is convenient
to perform this in the following way. First, rotating the reference
frames, we orient the x and x’ axes along the direction of the 
relative velocity of these frames K and K’. Then rotating the 
frames around the common axis x, x’, we orient axes y, y’ and 
z, z’ parallel to each other. In such a way, having lost none of
the generality in terms of physics, we come to the relative posi- 
tion of the coordinate systems shown in Fig. 1.2. In this case the 
velocity V has the components (V, 0, 0). The origin of the system
K’ slips along the common axis at the velocity V, while at
the initial moment of time both origins coincided. It is seen from
the vector formula (1.1) or directly from Fig. 1.2 that the
relationship between the coordinates of the point M, where we believe
the event happened, in the systems K and K’ is determined by
the following equations:

x’ = x − Vt, y’ = y, z’ = z.

Fig. 1.2. The two reference frames K and K’ with parallel axes move
relative to each other at the velocity V (V is the velocity of
motion of K’ relative to K). In classical physics coordinates of an
“event” are transformed from the reference frame K to K’ according
to the formulae of the “Galilean transformation”:
x’ = x − Vt, y’ = y, z’ = z, t’ = t.

Now, in order to establish fully what are the coordinates of the 
event in the system K’, one has to know the time of the event 
by the clock of the system K’ (now we have two clocks: one in 
the system K and another in the system K’). Since in both systems 
we employ infinitely fast signals, the finite relative velocity of 
the systems is inessential to such signals. Indeed, the infinite ve- 
locity remains infinite in both systems. Consequently, the time of 
the event registered by the clocks of both systems will be the
same, i.e. t = t’. This conclusion is confirmed by our own “common
sense”, because we do not detect any influence of the motion
on the clock rate in everyday life. But we should bear in mind, 
however, that infinitely fast signals were only assumed, and al- 
though the common sense does not deceive us in everyday life, 
we must be prepared for the fact that in the case of a finite velocity
it may turn out that t ≠ t’.

But within the scope of classical mechanics, as we have 
established by now, the formulae of transformation from the 
“coordinates” of an event determined in the system K (x, y, z, t)
to those of the system K’ (x’, y’, z’, t’) may be written as follows:

x’ = x − Vt,
y’ = y,
z’ = z, (1.2)
t’ = t.


Naturally, these equations are valid only for the relative position 
of the reference frames shown in Fig. 1.2. The transformation of 
the event “coordinates” from the frame K to the frame K’, as 
given by Eqs. (1.2), is called the Galilean transformation. We 
would like at once to draw readers’ attention to the fact that time 
turns out to be the fourth coordinate of an event so that when 
speaking of coordinates of an event, we imply four numbers (x, y, 
z, t). This is done not only for the sake of brevity. In the
special theory of relativity such a terminology gains a complete 
justification (see Chapter 4). 

We have already pointed out the equivalence of frames moving 
uniformly and rectilinearly relative to one another. The frames K 
and K’ that will be referred to from now on differ only in the 
velocity of K’ relative to K being equal to V, whereas the velocity 
of the frame K relative to K’ is equal to —V. Hence, in order to 
get the reverse transformation formulae, it is sufficient to make 
primed and unprimed quantities change place, having changed the 
sign of V in the process. We get 


x = x’ + Vt’,
y = y’,
z = z’, (1.3)
t = t’.


Surely, these same equations can be derived in a direct algebraic 
manner. 
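The direct and reverse transformations, Eqs. (1.2) and (1.3), can be tried out numerically. The following is only a minimal sketch for the relative position of the frames shown in Fig. 1.2; the function names and the sample event are assumptions introduced for illustration.

```python
def galilean(event, V):
    """Transform event coordinates (x, y, z, t) from K to K' by Eqs. (1.2)."""
    x, y, z, t = event
    return (x - V * t, y, z, t)

def galilean_inverse(event_primed, V):
    """Transform (x', y', z', t') from K' back to K by Eqs. (1.3)."""
    xp, yp, zp, tp = event_primed
    return (xp + V * tp, yp, zp, tp)

event = (10.0, 2.0, -1.0, 4.0)   # hypothetical event in K
V = 3.0                          # hypothetical velocity of K' along the x axis

primed = galilean(event, V)
print(primed)                              # (-2.0, 2.0, -1.0, 4.0)
print(galilean_inverse(primed, V) == event)  # True
# The reverse transformation is the direct one with V replaced by -V.
print(galilean(primed, -V) == event)         # True
```

The last line reproduces the argument of the text: interchanging primed and unprimed quantities and changing the sign of V gives the reverse formulae.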

Note one of the consequences of the Galilean transformation.
Suppose two events occurred in the frame K on the x axis: one
at the point x₁ at the moment t₁, and the other at the point x₂ at
the moment t₂ (t₁ ≠ t₂). Is it possible to select the frame K’ in
which both events would happen at the same point in space? Let
us find the x coordinates of these events in the frame K’:
x₁’ = x₁ − Vt₁, x₂’ = x₂ − Vt₂; and compose the difference
x₂’ − x₁’ = x₂ − x₁ − V(t₂ − t₁). Setting x₂’ − x₁’ = 0, we obtain the
equation from which the velocity of the frame K’ relative to K is
determined: V = (x₂ − x₁)/(t₂ − t₁). The meaning of the result
is very simple: during the time period t₂ − t₁ the frame K’ succeeds
in bringing the point x₁’ to the place where the second event
occurred by a requisite moment. We see that it is always possible
to select a frame K’ satisfying the required condition. It is
possible, however, only because classical mechanics permits the
velocity V to have any magnitude. In the theory of relativity, where
the velocity of a reference frame, just as any other material
object, is limited, the required frame cannot always be found.
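The choice of the frame K’ in which two events occur at the same point can be checked with a small numerical sketch. The function name and the sample events are assumptions made for illustration only.

```python
def comoving_velocity(x1, t1, x2, t2):
    """Velocity V of the frame K' in which both events have the same x'.

    Follows from x2' - x1' = (x2 - x1) - V * (t2 - t1) = 0.
    """
    if t1 == t2:
        raise ValueError("events at different points with t1 == t2 "
                         "cannot be brought to one point")
    return (x2 - x1) / (t2 - t1)

# Hypothetical events on the x axis of the frame K.
x1, t1 = 0.0, 0.0
x2, t2 = 12.0, 4.0

V = comoving_velocity(x1, t1, x2, t2)
x1_primed = x1 - V * t1
x2_primed = x2 - V * t2
print(V, x1_primed, x2_primed)   # 3.0 0.0 0.0
```

In classical mechanics any value of V returned by such a calculation is admissible; the restriction mentioned in the text appears only when the velocity of a frame is limited.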

Before proceeding to the Galilean principle of relativity, let us 
agree on one term. For the ease of speech “various observers” 
or “observers in different reference frames” are often mentioned. 
In the past such terminology provoked heated arguments,
because there were many who imagined that it implied a subjec- 
tive approach to physical measurements. But the presence of an 
observer is not at all mandatory as far as measurements are con- 
cerned: they can be taken by means of instruments and without 
man’s assistance. It is indeed the case, for example, with space- 
ships, even when there are people aboard. “An observer from a 
frame K” is, in fact, taken to mean a set of instruments resting 
in this frame. One should not be surprised by the fact that instruments
placed in different reference frames will give different results
for measured quantities associated with one and the same
phenomenon inasmuch as relative motion is a fundamental physical
quality. Objectivity of laws of nature manifests itself when
from results of observation in one reference frame one can find 
results of observation of the same phenomenon in any other 
frame. One may hope that after these remarks the appearance of 
an “observer” on the pages of this book will not give rise to any 
objections. 

§ 1.4. The Galilean principle of relativity. Newton’s second law. 
The Galilean principle of relativity pertains to mechanical pheno- 
mena exclusively; it was the first step toward the establishment 
of the principle of relativity that later embraced all physics. Ga- 
lileo noticed that uniform and rectilinear motion does not affect 
mechanical phenomena. It is necessary to formulate precisely 
what it means. As we already know, a reference frame is needed 
to describe any physical phenomena, including mechanical ones. 
Let us consider two reference frames moving uniformly and rectilinearly
relative to each other and let us conduct a “mechanical
experiment” in one of these frames. For example, we shall study 
the motion of a mathematical pendulum or the free fall of bodies. 
The principle of relativity states that identical experiments con- 
ducted in the two frames mentioned yield identical results. Hence, 
it is impossible to detect relative motion of the frames by means 
of such experiments. Of course, relative motion is easy to detect
in many ways provided experiments of another kind are
undertaken. The first formulation of the principle of relativity can
be found in Galileo’s book Dialogue Concerning the Two Chief 
World Systems — Ptolemaic and Copernican (1632). This formu- 
lation is of purely qualitative nature. We shall quote a short ex- 
tract from this book illustrating the essence of the problem: 

“Shut yourself up with some friend of yours in the main cabin 
below decks on some large ship, and have with you there some 
flies, butterflies, and other small flying animals. Have a large
bowl of water with fish in it; hang up a bottle that empties drop
by drop into a wide vessel beneath it. With the ship standing 
still, observe carefully how the little animals fly with equal speed 
to all sides of the cabin. The fish swim indifferently in all direc- 
tions; the drops fall into the vessel beneath; and, in throwing 
something to your friend, you need throw it no more strongly in 
one direction than another, the distances being equal; jumping 
with your feet together, you pass equal spaces in every direction. 
When you have observed all this carefully (though there is no 
doubt that when the ship is standing still everything must happen 
in this way), have the ship proceed with any speed you like, so 
long as the motion is uniform and not fluctuating this way and 
that. You will discover not the least change in all the effects 
named, nor could you tell from any of them whether the ship was 
moving or standing still.” 

The significant consequences follow from the qualitative for- 
mulation of the Galilean principle of relativity, which is the iden- 
tity of results of identical mechanical experiments conducted by 
two observers moving uniformly and rectilinearly relative to 
each other. Indeed, if the laws governing mechanical phenomena 
are known and all identical mechanical experiments produce the
same result regardless of a reference frame chosen, the laws of 
mechanics must also be identical in such frames. In other words, 
the equations of mechanics must be the same in all reference 
frames moving uniformly and rectilinearly relative to each other. 

Thus, the principal equations of mechanics written via coordi- 
nates and readings of a clock of its own reference frame must 
have the same form. At the same time it is clear that many quantities
vary on transition from one reference frame to another. Indeed,
let us examine the motion of a particle * in the frame K.
Usually it is defined as a time dependence of a radius vector
r = r(t). According to Eq. (1.1) the motion of the same particle
in K’ is defined by the variable radius vector r’(t) = r(t) − Vt.
Differentiating both sides of the last equation with respect to t
and taking into account that dr’/dt = v’, and dr/dt = v, we obtain

v’ = v − V. (1.4)


Hence the velocity of the particle in the frames K and K’ is 
different. The quantities that vary on transition from one coordi- 
nate system to another are called relative. Thus x coordinates and 
the velocity of a particle are relative quantities. Its acceleration, 
however, is the same in both frames, K and K’. This becomes 
evident immediately after differentiating Eq. (1.4):

dv’/dt = dv/dt (V = const).
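The relativity of velocity and the invariance of acceleration can be illustrated numerically. The sketch below assumes a hypothetical uniformly accelerated motion along the x axis and uses finite differences in exact rational arithmetic; the trajectory, the step h and all names are assumptions made for illustration.

```python
from fractions import Fraction as Fr

V = Fr(5)        # hypothetical velocity of K' relative to K
h = Fr(1, 100)   # step used for the finite differences

def x(t):
    """Hypothetical trajectory in K: x0 = 2, v0 = 3, acceleration 4."""
    return Fr(2) + Fr(3) * t + Fr(1, 2) * Fr(4) * t**2

def x_primed(t):
    """The same motion described in K', Eq. (1.1) along the x axis."""
    return x(t) - V * t

def velocity(f, t):
    """Central difference; exact for a quadratic trajectory."""
    return (f(t + h) - f(t - h)) / (2 * h)

def acceleration(f, t):
    """Second central difference; exact for a quadratic trajectory."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

t = Fr(1)
print(velocity(x, t), velocity(x_primed, t))          # 7 2
print(acceleration(x, t), acceleration(x_primed, t))  # 4 4
```

The velocities differ by exactly V, as in Eq. (1.4), while the accelerations coincide.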

The fact that the acceleration of objects is the same for all ob- 
servers in frames moving uniformly and rectilinearly relative to 
each other, is immediately evident. But this result makes it pos- 
sible for us to understand the statement that “the equations have 
the same form in all reference frames”. The fundamental equation 
of classical mechanics is that expressing the second law of New- 
ton. This equation relates a force F acting on a body and the ac- 
celeration acquired by it due to the action of this force: 


m dv/dt = F; (1.5)
the factor m is called the mass of a body. 

If the laws of mechanics in all reference frames moving uni- 
formly and rectilinearly relative to one another are really the same, 
Eq. (1.5) has to retain its form in all reference frames of this 
kind. It is not difficult to see that this is really the case. We have 
already shown that acceleration is the same in all reference 
frames being investigated. But what happens to forces on tran- 
sition from one reference frame to another? Suppose we investi- 
gate two objects: I and II. Let the force of their interaction 
depend on the distance between them, their relative velocity and 
time. But the Galilean transformation does not change any of 
these quantities. Indeed, let us write out the coordinates and
velocities of objects I and II in the frames K and K’, using the
Galilean transformation:

Object   Coordinates and     Coordinates in K’                      Transformation
         velocities in K                                            of velocities
I        x₁, y₁, z₁; v₁      x₁’ = x₁ − Vt, y₁’ = y₁, z₁’ = z₁      v₁’ = v₁ − V
II       x₂, y₂, z₂; v₂      x₂’ = x₂ − Vt, y₂’ = y₂, z₂’ = z₂      v₂’ = v₂ − V

* What is meant is a small object possessing a mass, but still so minute that
there is no need to take into account its rotation. In mechanics, in this case,
one speaks of a mass point, but since we shall have to deal with points in space
far too much, the term “particle” is preferable.

At once it becomes evident that x₂ − x₁ = x₂’ − x₁’,
y₂ − y₁ = y₂’ − y₁’, z₂ − z₁ = z₂’ − z₁’. And in so far as the distance
between the bodies is equal to

√[(x₂ − x₁)² + (y₂ − y₁)² + (z₂ − z₁)²] in the frame K

and

√[(x₂’ − x₁’)² + (y₂’ − y₁’)² + (z₂’ − z₁’)²] in the frame K’,

it is clear that it remains unchanged on transition from K to K’. As
to the relative velocity,

v₂’ − v₁’ = v₂ − v₁,

i.e. it remains unchanged. In accordance with the Galilean transformation
time is invariant: t = t’. Consequently, the forces dependent
on the variables cited do not at all vary on transition
from K to K’. But the forces considered in mechanics depend 
either on a distance (gravitational forces, forces of electric inter- 
action, elastic forces) or on a relative velocity (friction forces). 
Hence, forces occurring in mechanics stay permanent under the 
Galilean transformation. Inasmuch as all the quantities appear- 
ing in Eq. (1.5), accelerations and forces, do not vary under the 
Galilean transformation, the fundamental equation of classical 
mechanics, the second law of Newton, relating forces and accele- 
rations, has the same form in the frames K and K’ and differs 
only in the designations of variables. (Surely, it is assumed that 
a mass is a constant quantity; the mass invariance is one of the 
basic postulates of classical mechanics.*) The equation describing
the second law of Newton in the frame K has the following
form, provided the force depends on a distance and time:

m d²r/dt² = F(r₁₂, t).

Accordingly, in the frame K’

m d²r’/dt’² = F(r₁₂’, t’).

* Note that in Newtonian mechanics motion of bodies of a variable mass
can be examined, e.g. when jet propulsion or motion of a drop accompanied by
condensation is studied. But in all these cases a body either gives up substance
to the environment or acquires it from the environment. When mass variability
is mentioned in the STR (see Supplement IV), it is implied that the mass of a
body stays constant in the rest frame, i.e. there is no mass exchange between
a body and its environment.


An equation which does not change under a transformation of the
variables appearing in it, i.e. an equation whose terms are
invariant, is called invariant with respect to the given
transformation. Thus we have shown that the equation describing the
second law of Newton is invariant with respect to the Galilean
transformation.
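
The invariance of distances and relative velocities used in this argument can be checked numerically. The following is only a minimal sketch: the function names and the sample positions, velocities, time and frame speed are illustrative assumptions, and the frame K’ is assumed to move along the x axis of K with the speed V.

```python
import math

def galilean(pos, vel, t, V):
    """Galilean transformation of a position and a velocity from K to K'.

    K' is assumed to move along the x axis of K with the constant speed V.
    """
    x, y, z = pos
    vx, vy, vz = vel
    return (x - V * t, y, z), (vx - V, vy, vz)

def relative_velocity(v1, v2):
    """Velocity of body II relative to body I, component by component."""
    return tuple(b - a for a, b in zip(v1, v2))

# Hypothetical sample data (not taken from the text).
t, V = 2.0, 5.0
pos1, vel1 = (1.0, 2.0, 3.0), (0.5, 0.0, -1.0)
pos2, vel2 = (4.0, 6.0, 3.0), (2.0, 1.0, 0.0)

pos1p, vel1p = galilean(pos1, vel1, t, V)
pos2p, vel2p = galilean(pos2, vel2, t, V)

d_K = math.dist(pos1, pos2)      # distance between the bodies in K
d_Kp = math.dist(pos1p, pos2p)   # the same distance in K'
u_K = relative_velocity(vel1, vel2)
u_Kp = relative_velocity(vel1p, vel2p)

print("distance in K and K':", d_K, d_Kp)
print("relative velocity in K and K':", u_K, u_Kp)
```

Any force that depends only on these two quantities therefore takes the same value in both frames.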

Now we can formulate more precisely the terms on which 
“identical experiments produce identical results”. Newton’s Eq.
(1.5) is an ordinary differential equation of the second order. Its 
solutions describe motion of the system. To make the solutions 
of Eq. (1.5) coincident in the frames K and K’, i.e. to ensure the 
“identity” of motion, it is necessary for the initial conditions to 
coincide. The invariance of the basic equation of mechanics en- 
sures that mechanical phenomena proceed alike in all reference 
frames moving uniformly relative to one another, only when the 
initial conditions coincide in these frames. 

When the initial conditions for the same phenomenon differ in 
diverse reference frames, the phenomenon itself will look different. 
For example, while a raindrop falls down vertically from the 
viewpoint of an observer standing on a platform, the same rain- 
drop will move along a parabola from the viewpoint of an ob- 
server in a train. (We suppose that a raindrop falls down with 
an acceleration.) However, the initial data in these frames were 
different. From the viewpoint of an observer in a train the raindrop 
had initially a horizontal component of a velocity. An observer 
standing on a platform had to assume that at the initial moment 
a raindrop had no horizontal component of a velocity whatsoever. 
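
The raindrop example can be sketched in a few lines of code. This is an illustration under stated assumptions, not a calculation from the text: the drop is assumed to start from rest in the platform frame, the train speed V and the sample times are arbitrary, and the name `raindrop_positions` is hypothetical.

```python
def raindrop_positions(g, V, times):
    """Return the drop's (x, y) positions in the platform and train frames.

    Platform frame K: the drop starts from rest and falls vertically.
    Train frame K': obtained from K by the Galilean transformation x' = x - V t.
    """
    platform = [(0.0, -0.5 * g * t * t) for t in times]
    train = [(x - V * t, y) for (x, y), t in zip(platform, times)]
    return platform, train

g = 9.8    # free-fall acceleration, m/s^2
V = 20.0   # assumed train speed, m/s
times = [0.0, 0.5, 1.0, 1.5]

platform, train = raindrop_positions(g, V, times)
for t, p, q in zip(times, platform, train):
    print(f"t={t:.1f} s  platform: {p}  train: {q}")
```

In the platform frame x stays equal to zero, while in the train frame the points satisfy y' = -g x'^2 / (2 V^2), i.e. they lie on a parabola. The equation of motion is the same in both frames; only the initial horizontal velocity differs (0 in K, -V in K’).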

We have already mentioned above that in Newtonian mechanics 
an interaction between bodies is assumed to be transmitted in- 
finitely fast. Now we are able to explain this in detail. “Interac- 
tion” between bodies is specified by forces. In classical mechanics 
forces are regarded as being dependent on distances between bo- 
dies. The same is assumed to be correct for bodies moving relative 
to one another. But a distance between two moving bodies has 
to be put down as 


$$r_{12} = \sqrt{[x_2(t) - x_1(t)]^2 + [y_2(t) - y_1(t)]^2 + [z_2(t) - z_1(t)]^2}.$$

Assuming that an interaction, that is a force, is transmitted at
a finite velocity, one cannot presume that the equation defining
this force still incorporates $r_{12}$. If we want to find the force
that is exerted on body I by body II, the position of body II should
be registered not at the moment t but earlier by the time interval
needed for the interaction to be transmitted from body II to body
I. When this time lag is ignored, it means that the velocity at
which the interactions are transmitted is assumed to be infinite.
This is precisely how problems are tackled in Newtonian mechanics.
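
The time lag described here can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration: body I is placed at rest at the origin, body II moves uniformly along one axis, the transmission speed is an arbitrary number, and the function `retarded_time` is a hypothetical helper that solves t_r = t - |x_II(t_r)| / speed by simple iteration.

```python
def retarded_time(t, position_of_II, speed, tol=1e-12, max_iter=200):
    """Find the earlier moment t_r at which body II 'sent' the interaction
    that reaches body I (at rest at the origin) at the moment t.

    Solves t_r = t - |x_II(t_r)| / speed by fixed-point iteration, which
    converges when body II moves slower than the transmission speed.
    """
    t_r = t
    for _ in range(max_iter):
        t_next = t - abs(position_of_II(t_r)) / speed
        if abs(t_next - t_r) < tol:
            return t_next
        t_r = t_next
    raise RuntimeError("iteration did not converge")

# Hypothetical uniform motion of body II along the x axis.
x0, u = 10.0, 1.0

def x_II(t):
    return x0 + u * t

t = 3.0
t_r = retarded_time(t, x_II, speed=5.0)
print("finite speed:   time lag =", t - t_r)

# Letting the transmission speed grow without limit removes the lag,
# which is the Newtonian assumption.
t_r_fast = retarded_time(t, x_II, speed=1.0e12)
print("very large speed: time lag =", t - t_r_fast)
```

With a finite speed the force on body I at the moment t is determined by where body II was at t_r, not by $r_{12}(t)$.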

The same thing happens when a potential energy is introduced. 
Having written down a central force for the interaction of two 
particles as usual in the form 


$$\mathbf{F} = -\nabla U(|\mathbf{r}_1(t) - \mathbf{r}_2(t)|),$$


we explicitly ignore the time lag in the interaction transmission. 

The instantaneous transmission of interactions, formerly re- 
ferred to as a long-range action, appears amazing and obscure 
to us. The transmission of any signal, i.e. an impulse or energy
capable of accomplishing some action, e.g. switching on a certain 
device, requires some time. As our experience teaches us, it is 
impossible to transmit a signal from “here” to another place 
(“there”) instantaneously. Yet in Newton’s time no other idea 
except the long-range action could emerge, as far as a transmis- 
sion of interactions is concerned. The finite velocity of the trans- 
mission of interactions appeared together with the concept of a 
field which was introduced into the theory of electromagnetism 
by Maxwell. In Maxwell's theory the interaction of charges or 
currents is realized through a field to which an independent ex- 
istence is attributed. It follows from the theory that a field propa- 
gates at a finite velocity. This means that the velocity of propaga- 
tion of interaction is the same as that of the field. The propaga- 
tion velocity of an electromagnetic field in vacuo plays a funda- 
mental role in the theory of relativity. It is designated by the 
letter c and is approximately equal to $3 \cdot 10^8$ m/s. Since field
variations are transmitted from point to point, field theories are
referred to as short-range ones. As we shall see later, the theory 
of relativity rejects the long-range action as a matter of principle. 

§ 1.5. Newton's laws and inertial frames of reference. The basic 
laws of mechanics, Newton’s laws, make it possible for us to 
distinguish among all conceivable reference frames, the special 
class of frames in which not only laws of mechanics but also all 
other physical laws look particularly simple. These are the so- 
called inertial frames of reference. An inertial frame of reference 
is a frame (or rather frames, since it will turn out later that there 
are an infinite number of them) in which all three laws of
Newton are valid.

We begin by showing how important the first law of Newton
is for the discrimination of inertial frames of reference among all
others. The first law of Newton, the law of inertia, claims that a
body subjected to no forces moves due to inertia, i.e. uniformly
and rectilinearly. Frequently one could hear, or even read in a 
textbook, that the first law is not an independent statement, but 
only a consequence of the second law. 

Formally it is the case. The resultant of all forces acting on 
a body appears in the right-hand side of Eq. (1.5). The second 
law just claims that an acceleration acquired by a body is directly 
proportional to this resultant and inversely proportional to its 
mass. It follows from Eq. (1.5) that if the resultant of all forces 
is equal to zero, or there are no forces whatsoever, the body gains 
no acceleration. And if a body gains no acceleration, it either 
moves uniformly and rectilinearly or is at rest. It used to be con- 
cluded from this that the law of inertia could be obtained from 
the law of dynamics. 

Then why was it necessary for Newton to formulate the law 
of inertia separately? It is doubtful that Newton did not realize 
that the law of inertia is a consequence of the law of dynamics. 
The problem is more complicated than it may seem at first sight.
Newton understood very well that neither Eq. (1.5) nor the law 
of inertia can be equally valid in all reference frames. It is not 
accidental that the definition of an inertial frame involves all 
three laws of Newton. Let us recollect the third law: to every 
action there is an equal and opposite reaction. This law empha- 
sizes that all forces in Newtonian mechanics are intrinsically as- 
sociated with an interaction between bodies. 

Let us examine one useful, even though very plain example. 
Let a body be at rest in an inertial frame of reference K. Then 
according to the second law of Newton no forces act on this body. 
Without touching it let us consider it from the viewpoint of an 
observer moving relative to the frame K with an acceleration a. 
This observer will note that the body in question moves relative 
to him with an acceleration —a. If the second law of Newton were 
valid in his frame, he could say that the body experiences the 
force —ma. But we know from the observer in an inertial frame 
of reference that there is no force acting on the body. Therefore, 
the second law of Newton is simply not valid in the reference
frame moving relative to the inertial frame with an acceleration.
Many readers have already realized, of course, that on passing into
a reference frame moving with an acceleration we detect
“a force of inertia” which is not actually a force in Newtonian
mechanics (see Supplement V). Since the laws of Newton are not
valid in all reference frames, Newton had to point out that a
certain reference frame was available in which all these laws were
valid. And the first law of Newton is, in fact, equivalent to this
statement. This law postulates that an inertial frame of reference, 
i.e. a reference frame in which the law of inertia is valid, is 
available. In other words, one can find a reference frame in which
a body that interacts with no other bodies moves due to inertia,
i.e. uniformly and rectilinearly.
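
The example of the accelerated observer can be reproduced numerically. This is a minimal sketch under assumptions of our own: the observer's frame is taken to start from rest with its origin coinciding with that of K, the values of m, a and the body's position are arbitrary, and the acceleration is estimated by a central finite difference.

```python
def position_in_accelerated_frame(x_K, t, a):
    """Coordinate, in the observer's frame, of a point with coordinate x_K in K.

    The observer's origin is assumed to move as a * t**2 / 2 relative to K.
    """
    return x_K - 0.5 * a * t * t

def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

m, a = 2.0, 3.0   # hypothetical mass (kg) and observer acceleration (m/s^2)
x_rest = 1.0      # the body is at rest at this point of the inertial frame K

apparent_acc = second_derivative(
    lambda t: position_in_accelerated_frame(x_rest, t, a), t=1.0
)
apparent_force = m * apparent_acc

print("acceleration seen by the accelerated observer:", apparent_acc)
print("'force' he would infer from the second law:   ", apparent_force)
```

The observer finds the acceleration -a and would infer the "force" -ma, although no other body acts on the one observed.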

The law of inertia represents a special case of the law of
conservation of momentum. On the one hand, it is a consequence of
the second and the third laws of Newton, and on the other hand, a
consequence of the second law and the assumption about uniformity of
space (equivalence of all its points), i.e. Newtonian mechanics
assumes the uniformity of space in any inertial frame of reference.

Now suppose we have found one inertial frame of reference. Then
according to the Galilean principle of relativity all reference
frames moving uniformly and rectilinearly relative to it will be
inertial as well. Therefore, it is clear that there is an infinite
number of inertial frames of reference.

Fig. 1.3. The Foucault experiment designed to detect one of IFRs.
For the sake of simplicity the drawing illustrates the Foucault
experiment being performed at the Pole. In fact, the experiment was
conducted in Paris, but this circumstance does not change the
matter. (Label in the drawing: the Earth rotation axis.)

How does one find at least one inertial frame of reference? Of
course, the discovery of such a frame is a matter of experience. The
famous pendulum experiment first conducted by Foucault is suitable
for the purpose. For the sake of simplicity we shall describe the
experiment the way it could be conducted at one of the Earth’s poles
(Fig. 1.3). A heavy ball is
suspended on a thread which is attached to a frame constructed
at the Pole. The point of the pendulum suspension is located on
the Earth’s axis. The attachment of the thread is free and so the
frame does not carry the thread along in the process of rotation
around the Earth’s axis. The equilibrium position of the pendulum
thread coincides with the Earth’s axis. If one deflects the pendulum
from the equilibrium position and then lets it go without
imparting an initial velocity, it will start oscillating in a certain
plane. The two forces acting on the pendulum are the gravitational
force mg and the tension T of the thread. Both
forces act in the plane P of the pendulum oscillations and cannot
remove the pendulum from that plane. If the second law of Newton
were strictly valid on the Earth, the plane of the pendulum
oscillations would maintain its orientation relative to the Earth. But
in the experiment the Earth turns away from under the pendulum
and thereby “registers” the fact that in the coordinate system
associated with the Earth the second law is not, strictly speaking,
valid.
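
How fast the oscillation plane turns relative to the Earth can be estimated with a short sketch. It relies on the standard result, not derived in this text, that at the latitude φ the plane makes one full turn in one sidereal day divided by sin φ. The length of the sidereal day, the latitude taken for Paris and the function name are assumptions supplied for the illustration.

```python
import math

SIDEREAL_DAY_H = 23.9345  # approximate length of the sidereal day, hours

def foucault_period_hours(latitude_deg):
    """Time for the oscillation plane to make one full turn relative to the Earth.

    Uses the standard sine law: period = sidereal day / |sin(latitude)|.
    """
    s = abs(math.sin(math.radians(latitude_deg)))
    if s == 0.0:
        raise ValueError("at the equator the plane does not turn")
    return SIDEREAL_DAY_H / s

pole = foucault_period_hours(90.0)
paris = foucault_period_hours(48.85)   # assumed latitude of Paris, degrees

print(f"Pole:  one turn in {pole:.2f} h")
print(f"Paris: one turn in {paris:.2f} h")
```

At the Pole the plane turns once per sidereal day, as in the simplified description above; in Paris the turn is slower, but the conclusion that the Earth is not an inertial frame is the same.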

One should not be too upset by this, since Newton’s laws can
still be used on the Earth to great advantage. This is evident,
for example, from the fact that the whole of engineering and
theoretical mechanics relies on the second law of Newton
without any corrections. Of course, this is because the corrections
are small: they are caused by the Earth’s rotation, which is not very
fast. Therefore, the Earth can be treated as an inertial frame even
in a school textbook.
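
The smallness of these corrections can be illustrated by comparing the centripetal acceleration of a point on the equator with the free-fall acceleration. The numerical constants below are rounded reference values supplied for the sketch, not figures from the text.

```python
import math

SIDEREAL_DAY_S = 86164.1               # rotation period of the Earth, s (approximate)
EARTH_EQUATORIAL_RADIUS_M = 6.378e6    # equatorial radius, m (approximate)
G_FREE_FALL = 9.81                     # free-fall acceleration, m/s^2 (approximate)

omega = 2.0 * math.pi / SIDEREAL_DAY_S              # angular velocity, rad/s
centripetal = omega ** 2 * EARTH_EQUATORIAL_RADIUS_M
ratio = centripetal / G_FREE_FALL

print(f"angular velocity of the Earth: {omega:.3e} rad/s")
print(f"centripetal acceleration at the equator: {centripetal:.4f} m/s^2")
print(f"fraction of g: {ratio:.4f}")
```

The correction amounts to a fraction of a per cent of g, which is why it can usually be neglected in engineering practice.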

But fundamentally the Earth is not an inertial frame. An inertial
frame involves a coordinate system relative to which
the plane of pendulum oscillations remains constant. Such a system
can be found from the same Foucault experiment. The system
turns out to be rather “exotic”. Its centre is located in the Sun
and the three coordinate axes are directed to the “stationary”
stars, i.e. the stars moving rigidly together with the so-called
celestial sphere. Due to the singular role of the Sun the inertial
frame based on this system is referred to as heliocentric. In the
choice of an inertial frame the most important thing is the choice
of directions for the coordinate axes. The choice of the origin in
the centre of inertia of the Sun is convenient because the Sun
possesses the largest mass in the whole solar system. The motion of
planets appears particularly simple in this frame. Note that the axes
of the heliocentric reference frame do not participate in the
rotation of the Sun. By the way, the reference frame with the
coordinate axes fixed rigidly to the Earth, i.e. rotating with it, is
referred to as geocentric. As Foucault’s experiment showed, this
frame is non-inertial.

Thus, the Newtonian laws of dynamics are applicable in the 
heliocentric frame. In accordance with the Galilean principle of 
relativity the laws of Newton are equally valid in all reference 
frames which move uniformly and rectilinearly relative to the 
heliocentric one. We shall refer to all these reference frames as 
inertial frames of reference *. Although the number of inertial
frames of reference is infinite, they still get lost among all feasible
kinds of frames. If it were possible to gather all kinds of frames
into a sack and then to draw out of it one frame at random, we 
would get most likely a non-inertial frame. 

* Hereinafter we shall often abbreviate the term “inertial frame of reference”
to the initial letters, i.e. IFR.

Foucault’s experiment is far from being the only one permitting
the detection of a deviation of the geocentric reference frame from
an inertial one. We shall indicate another experiment of this kind.
When a heavy object is dropped from some height, it does not
fall vertically down as it should due to the gravitational force,
but deviates slightly to the east. The deviation of the motion of
freely falling objects from the vertical makes it possible to detect
the non-inertial nature of the geocentric frame and to find an
inertial frame of reference.
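
The size of the eastward deviation can be estimated with the standard first-order formula $\frac{1}{3}\,\omega g t^3 \cos\varphi$, where t is the time of fall and φ the latitude. The formula is not derived in this text, and the height, the latitudes and the function name below are assumptions chosen for the illustration; air resistance is ignored.

```python
import math

def eastward_deflection(height_m, latitude_deg, g=9.81, omega=7.2921e-5):
    """First-order eastward deflection (m) of a body dropped from rest.

    Uses d = omega * g * t**3 * cos(latitude) / 3 with t = sqrt(2 h / g).
    The default g and omega are rounded values for the Earth.
    """
    t_fall = math.sqrt(2.0 * height_m / g)
    return omega * g * t_fall ** 3 * math.cos(math.radians(latitude_deg)) / 3.0

d_equator = eastward_deflection(100.0, 0.0)
d_pole = eastward_deflection(100.0, 90.0)

print(f"drop from 100 m at the equator: {100 * d_equator:.2f} cm to the east")
print(f"drop from 100 m at the pole:    {100 * d_pole:.2f} cm")
```

The deviation is only about two centimetres for a fall of a hundred metres, so the experiment calls for considerable care.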

In mechanics there is one more conservation law for closed-type
systems, the conservation of moment of momentum (angular
momentum). Just like the law of conservation of momentum in a
closed-type system, it is a consequence of the second and third laws
of Newton. Moreover, it can be obtained as a consequence of the
second law and the assumption about the isotropy of space. This
implies that Newtonian mechanics presupposes the isotropy of space.

The law of conservation of energy for closed-type systems turns 
out to be a consequence of the second law of Newton and the as- 
sumption about a potential character of forces acting between 
particles constituting the system. On the other hand, it stems 
from the motion equations of the system and the assumption 
about the uniformity of time. It follows that in Newtonian me- 
chanics the uniformity of time is presupposed. 

That is why an inertial frame of reference can be defined
as one relative to which space is uniform and isotropic, and time
uniform.

Sometimes an inertial frame of reference is defined as a frame 
fixed rigidly to a free-moving object. Although this definition is 
basically true, it cannot be practically used for the purpose of an 
experimental identification of an IFR. There is no “free-moving” 
object at our disposal, since the gravitational force cannot be 
cancelled. Consequently, it is more correct to define an IFR as a 
frame in which all three laws of Newton are valid. 

Inertial frames are distinguished among other, non-inertial,
reference frames not only in mechanics. An electric charge does
not radiate electromagnetic waves when at rest in an inertial
frame of reference, whereas in a non-inertial frame it does.

Inertial frames of reference play a tremendous role in physics. 
It is for these frames that familiar laws of physics are recorded. 
The transition to non-inertial frames is associated with consid- 
erable difficulties. The special theory of relativity instructs us 
how to describe all kinds of physical phenomena in any inertial 
frame of reference *. But what does it mean and how is it
practically carried out? We have much to discuss before we can get
answers to these questions.

* It must be stressed that the STR can also be formulated for non-inertial
frames of reference. In fact, the STR can be employed in any reference frame,
as long as there are no gravitational forces, i.e. in a flat four-dimensional
space-time. However, the form of the STR suggested by Einstein and to be
developed in this book is applicable only to inertial frames of reference.

§ 1.6. Absolute time and absolute space. Although in deriving 
the Galilean transformation we have, in fact, already spoken of 
everything that was meant in that transformation by space and 
time, we shall repeat the pertinent statements. Usually, when 
“classical” physics is mentioned, Newtonian mechanics is implied,
for the views of Newton on space and time reflect precisely the
classical approach to these concepts. Newton’s ideas are worth
dwelling on more carefully because they correspond to our
everyday experience and are customary and comprehensible, while the
transition to the concept of space and time inherent in the special
theory of relativity presupposes renouncing these ideas. In
addition, a still more decisive step away from these concepts was
made by Einstein in his theory of gravitation which is sometimes
referred to as the general theory of relativity. This is what one
can read in Newton’s Philosophiae Naturalis Principia Mathema- 
tica (1687): “Absolute space, in its own nature, without relation 
to anything external, remains always similar and immovable.” 

So, according to Newton, space represents a giant empty box
which contains material objects and where physical phenomena 
take place. At the same time Newton was aware that the Galilean 
principle is valid in mechanics. And this indicates that the states 
of immobility and uniform rectilinear motion are equivalent. Then 
how should one single out “motionless absolute” space? 

Of course, it is impossible to single out “motionless absolute” 
space just by observing mechanical phenomena. Detection of
absolute space and absolute motion involves studies outside the
scope of mechanics. Such a detection is assumed to be possible
in the process of interpreting optical phenomena. Consequently,
in the historic essay dedicated to the interpretation of some
experimental facts (see Supplement II) we shall presume that
Newton’s privileged, selected reference frame, that is motionless
absolute space, is the heliocentric frame. Finally, it will be clear, though,
that there is no such thing as a privileged frame at all, but there 
is a whole privileged class of reference frames in which laws of 
physics appear particularly simple. Such is the class of inertial 
frames of reference. 

Now let us see what Newton wrote about time: 

“Absolute, true, and mathematical time, of itself, and from its
own nature, flows equably without relation to anything external, 
and is otherwise called duration.” 

Again we come across the statement that time is something 
external relative to nature. Thus, in accordance with Newton's 
ideas, time and space exist by themselves and do not depend on 
material bodies located in space. Surely, Newton's concepts of
space and time seem very scholastic to us. However, they should
not be underestimated. Here is a short excerpt from the book [11]: 

“In conversations with one of the authors of this book at var- 
ious times over the years, Einstein emphasized his great respect 
for Newton and, in particular, his admiration for Newton's cour- 
age. He stressed that Newton was even better aware than his 
17th century critics of the difficulties with the ideas of absolute 
space and time. However, to postulate those ideas was the only 
practical way at that time to get on with the task of describing 
motion”. 

Of course, the natural question arises: why does classical me- 
chanics based on such concepts of space and time that can hardly 
be explained, function so efficiently? It turns out, however, that 
these concepts are approximately correct and the departures from 
them in everyday life are quite insignificant. The departures from 
the classical ideas become clearly visible only when micropartic- 
les are investigated and also in outer space conditions which 
modern physics has already begun studying. Such investigations, 
however, require special conditions and sufficiently complex equip- 
ment. 

To end this brief section it is necessary to give a concise pre- 
sentation of the up-to-date approach to the problem. From the 
modern point of view there is no absolute space and, consequently, 
no absolute motion. All inertial frames of reference are equivalent. 
The special theory of relativity shows that time readings for 
events prove to be different in different inertial frames of reference. 
Thus, time reading is found to depend on the state of motion. The 
gravitational theory of Einstein goes still further. In terms of this 
theory properties of space and time are not prescribed for ever 
but are specified by objects located in space. Since in accordance 
with dialectic materialism space and time are forms of existence 
of matter, the conclusions of Einstein’s theory of gravitation ap- 
pear far more satisfactory than the Newtonian concepts of space 
and time. 

§ 1.7. How physics was approaching the theory of relativity. 
From the point of view of modern physics it is useful to trace 
how relativistic effects were showing well before the crea- 
tion of the special theory of relativity. This section does not 
claim to be an historic essay (Supplement II is closer to that). It 
is intended only for promoting the understanding of the next two 
sections, where, in fact, the first principles of the theory are pre- 
sented. 

No doubt, the first step to the development of the special theory 
of relativity was the discovery by Galileo of the principle of rela- 
tivity for mechanical phenomena. 

The natural question arises: why did Galileo confine his
principle within the framework of mechanics? The answer is very
straightforward: in Galileo’s times there were just no “other
branches of physics” as we call them now. In fact, mechanics
represented the whole of physics. If one also takes into consideration
that attempts were made to explain all physical phenomena on
the basis of mechanics almost till the end of the 19th century, it
becomes clear that the principle of relativity formulated by
Galileo encompassed the “whole physics” at that time.

The next important step along the road to the theory of rela- 
tivity was the establishment of the finiteness of the velocity of 
light. The conclusion was made by Roemer on the basis of his 
astronomical observations (1676). Before Roemer the velocity of 
light propagation was assumed to be infinite. 

The Galilean principle of relativity could be expressed in a 
mathematical form only after equations of mechanics had been 
written down (Newton, Philosophiae Naturalis Principia Mathe- 
matica, 1687). Since coordinates and time are the basic variables 
involved in equations of mechanics, their transformation on tran- 
sition from one reference frame to another, moving relative to the 
former, requires appropriate equations for transformation of co- 
ordinates and time under such a transition. It followed from the 
Galilean principle of relativity that the requisite transformation 
of coordinates and time should not alter the form of Newton’s 
laws (§ 1.4). This transformation is that of Galileo. 

In 1851 the Foucault pendulum experiment was performed at 
the Pantheon in Paris, which definitely demonstrated the Earth’s 
rotation and indicated the inertial frame of reference (§ 1.5). In 
fact, one could conclude the description of mechanical phenomena, 
linked directly with the theory of relativity, with this experiment. 

The Galilean principle of relativity, Newton’s laws and the
Galilean transformation are all closely interrelated. The direct
consequence of the Galilean transformation is the classical formula
for the velocity transformation (Eq. (1.4)): $v' = v - V$. In 1851
Fizeau performed an experiment which showed explicitly that this
formula is not always correct. The Fizeau experiment with flowing
water was schematically conducted as follows. In the reference
frame K water was flowing along a tube at a velocity V, and the
velocity of light in water was being measured. Now we can reason
strictly in terms of kinematics. Let us fix the inertial frame K’ to
the moving water. In this frame the velocity of light v’ is determined
by the familiar relationship $v' = c/n$, where n is the refraction
index of water. To find the velocity of light in the frame K, one
can use Eq. (1.4), and then $v = c/n + V$. But Fizeau’s result,
confirmed also by modern measurements, turned out to be

$$v = \frac{c}{n} + V\left(1 - \frac{1}{n^2}\right).$$

As can be seen from this, the classical Eq. (1.4) is not correct in
this case. And this is exactly what we wanted to emphasize. The
details of the experiment and its contemporary interpretation
can be found in § 3.6.
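
To get a feeling for the size of the discrepancy, the two expressions can be compared numerically. The sketch below is only an illustration: the refraction index and the flow velocity are assumed round values, not the data of Fizeau's experiment, and the function names are hypothetical.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s

def classical_speed(n, V):
    """Velocity of light in K predicted by the classical Eq. (1.4): c/n + V."""
    return C / n + V

def fizeau_speed(n, V):
    """Velocity of light in K according to Fizeau's result: c/n + V (1 - 1/n^2)."""
    return C / n + V * (1.0 - 1.0 / n ** 2)

n = 1.333   # assumed refraction index of water
V = 7.0     # assumed velocity of the water flow, m/s

drag = 1.0 - 1.0 / n ** 2
discrepancy = classical_speed(n, V) - fizeau_speed(n, V)

print(f"factor (1 - 1/n^2): {drag:.4f}")
print(f"classical prediction exceeds Fizeau's result by {discrepancy:.3f} m/s")
```

For water the light is carried along with less than half of the flow velocity V, whereas Eq. (1.4) requires it to be carried along with the whole of V.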

The theory of relativity owes very much to the Maxwell theory
formulated first in several large articles and later published in
the two-volume Treatise (1856-1873). That was the first field
theory in which an interaction was supposed to be transmitted at
a finite velocity, that is the field propagation velocity. The theory
provided a quite definite value for this velocity, which was equal
to $1/\sqrt{\varepsilon_0 \mu_0}$ in vacuo, where $\varepsilon_0$ and
$\mu_0$ are the electric and magnetic constants. Naturally, the
question arose at once as to whether the principle of relativity was
satisfied; in other words, whether the Maxwell equations retained
their form under the Galilean transformation. One can easily check
that the Galilean transformation changes the appearance of the
Maxwell equations. Owing to this fact it was suspected that the
principle of relativity did not extend to electrodynamics. Several
decades were needed to realize what a wonder of a theory Maxwell
had developed. Neither knowing, nor even suspecting anything about
the theory of relativity, Maxwell nevertheless developed his theory
in complete agreement with the requirements of the theory of
relativity.
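
The value $1/\sqrt{\varepsilon_0 \mu_0}$ is easily evaluated. The constants below are rounded reference values in SI units supplied for this sketch; they are not quoted in the text.

```python
import math

EPSILON_0 = 8.8541878128e-12   # electric constant, F/m (reference value)
MU_0 = 1.25663706212e-6        # magnetic constant, H/m (reference value)

c = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"1/sqrt(eps0 * mu0) = {c:.6e} m/s")
```

The result is about $3 \cdot 10^8$ m/s, the velocity of light in vacuo.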


Now, when we know full well where the influence of the theory 
of relativity can be “detected”, it is easy to come back to essen- 
tial facts. The theory of relativity reveals itself when velocities 
of objects get closer to that of light in vacuo, such velocities being 
referred to as relativistic. However, there are no macroscopic 
objects possessing relativistic velocities. Only microscopic
particles can travel at velocities close to that of light. The first
microparticle to be discovered was the electron (Thomson, 1894-1896).
Thomson determined the ratio of the charge of an electron to its
mass experimentally. His experiments were carried out in
discharge tubes where electron velocities were far below that of light.
But in 1896 natural radioactivity was discovered. The stream
of electrons found among radiations emitted by radioactive
substances was very soon identified with electrons in a discharge
tube. Velocities of these electrons turned out to be close to that
of light. When in 1902 Kaufmann investigated the motion of such 
electrons in electric and magnetic fields, the classical equation of 
motion, i.e. the second law of Newton, was found to describe their 
behaviour incorrectly. Thus, the departure from the Newtonian 
laws was observed for the first time. 

Summing up, it can be said that by the beginning of the 20th
century it became obvious that Newtonian mechanics and the
Galilean transformation are not always true, and the fastest signals
of all known, that is light signals, are transmitted at a finite
velocity.

Although there are no macroscopic objects moving at a
relativistic velocity, one relativistic object, light, was always at man’s
disposal. Naturally, optical experiments played a significant role
in the history of the STR: the interpretation of optical
experiments is associated with the emergence of a hypothesis
concerning a “luminiferous medium”. Rejection of this hypothesis took
much effort, but now it is worth mentioning only as a page in the
history of physics (see Supplement II).

§ 1.8. The generalization of the Galilean principle of relativity. 
The Galilean principle of relativity covered only mechanical phe- 
nomena. We found that the second law of Newton expressed in 
a differential form in combination with the Galilean transforma- 
tion satisfied the principle of relativity. From the formal point of 
view it implied that Eq. (1.5) remained invariant and only
designations of variables changed. Naturally, the question arises: why
must the principle of relativity cover only mechanical phenomena?
Why is it impossible to believe that all physical phenomena
happen in the identical manner in all inertial frames, provided the
initial conditions of these phenomena are identically specified? In
other words, why is it impossible to assume all inertial frames of
reference to be completely equal with respect to all physical
phenomena?

These questions did not worry physicists too much till the 
middle of the 19th century since they reduced all physics to me- 
chanics. But by the middle of the 19th century it became evident 
that physics cannot be reduced to mechanics. By the same time
the conviction had grown as to the universal relationship between 
phenomena, and between physical phenomena in particular. The 
subdivision of physics into “mechanics”, “electricity”, “heat” etc. 
is justified by the fact that each group of phenomena possesses 
its own set of basic equations and so is caused by rather educa- 
tional requirements and is not intrinsically imperative. Looking 
more carefully into even “purely mechanical” phenomena, one can 
discern a manifestation of regularities of another kind. The colli- 
sion of billiard balls is always cited as a classical example from 
mechanics. But at the moment of collision, when the balls are
slightly flattened, the elastic forces defined by electromagnetic 
forces come into play. Hence, no “purely mechanical” phenomena 
can exist in nature. It follows that the principle of relativity must 
either cover “all physics” or be wholly incorrect. 

Thus, the extension of the principle of relativity to all physical 
phenomena was quite natural from the viewpoint of physics at the 
end of the 19th century. But such generalization of the Galilean
principle of relativity is exactly what is called the first postulate
of Einstein, or the Einstein principle of relativity.

However, the equations of electrodynamics were at once found 
to contradict the equivalence of inertial frames of reference. 

First of all, so far as the basic system of electrodynamic equa- 
tions, that is the Maxwell equations, is concerned, they alter their 
appearance under the Galilean transformation, i.e. do not retain 
their form, and it follows from here that electromagnetic pheno- 
mena are described differently in different IFRs. In other words, 
electromagnetic phenomena do not obey the principle of relativity. 
In particular, this means that in the reference frame in which the 
Maxwell equations are written down in the conventional form
(see Chapter 6), the propagation velocity of electromagnetic
waves c is equal to $1/\sqrt{\varepsilon_0 \mu_0}$, while in all other reference frames
moving relative to the first one, the velocity is different. But
vacuum occupies a special place relative to reference frames. Indeed,
it is remarkable because it has no “medium” possessing
the rest mass. One can always fix a reference frame to a material 
medium, i.e. single out such a frame in which the medium is at 
rest as a whole or in a limited region. But this particular reference 
frame is the chosen one. Another equivalent frame of reference 
moving relative to the first one must possess the same property 
of the motionless medium. But this creates a different physical 
situation. Thus, the presence of a medium always distinguishes 
one reference frame from all others. But it is impossible to single 
out such a system in vacuo because there is no reference frame 
in which vacuum is at rest. Consequently, all reference frames 
are equivalent relative to vacuum. It follows logically from here
that provided all inertial observers are equal, the velocity of
electromagnetic waves must be the same, $1/\sqrt{\varepsilon_0 \mu_0}$, in all IFRs.


The classical formula for the transformation of velocities,
Eq. (1.4), shows, however, that this is not the case. Suppose that
in an inertial frame of reference K the velocity of light in vacuo
is equal to c. Then in another inertial frame of reference K’ the
velocity of light in vacuo c’ is equal to c - V. Hence, the velocity
of light in vacuo is equal to $1/\sqrt{\varepsilon_0 \mu_0}$ only in one
privileged reference frame. Thus, the principle of relativity seemed
to be incorrect for electromagnetic phenomena.
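
The classical argument can be illustrated by a short numerical sketch. The function name and the sample frame velocities are assumptions introduced here for illustration only; the sole physical input is the classical rule c’ = c - V of Eq. (1.4).

```python
# Classical (Galilean) addition of velocities applied to light, Eq. (1.4).
C = 299_792_458.0  # velocity of light in vacuo in the frame K, m/s


def galilean_light_speed(v_frame: float, c: float = C) -> float:
    """Velocity of light in K' by the classical rule c' = c - V."""
    return c - v_frame


# Hypothetical velocities of K' relative to K along the x axis, m/s.
for v in (0.0, 30_000.0, 0.5 * C):
    c_prime = galilean_light_speed(v)
    print(f"V = {v:.3e} m/s -> c' = {c_prime:.6e} m/s")
```

In this classical picture only the frame with V = 0 yields the value c, which is exactly the "privileged" frame mentioned in the text.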


The above reasoning was based on the fundamental assumption 
which was absolutely unacceptable for the 19th century physics: 
electromagnetic waves, i.e. light, can propagate in vacuo or, ex- 
pressed otherwise, no matter is needed for their propagation. This 
is a very difficult point to comprehend, when a transition from 
classical physics to relativistic is undertaken. 

But what could be done in such a situation? Logically, three
possibilities were open.

(1) The principle of relativity could be assumed to cover only 
mechanics and have nothing to do with electrodynamics in which 
there is an “absolute” frame of reference. But, as it was men- 
tioned before, such a possibility is rejected when the general re- 
lationship of physical phenomena is taken into consideration. 

(2) The principle of relativity could be regarded as universally
applicable, and inasmuch as the system of Maxwell’s equations
does not satisfy this principle, that is it changes its appearance 
under the Galilean transformation, it should be discarded. But 
the system of Maxwell’s equations showed itself as a reliable and 
comprehensive theory within one inertial frame of reference, a la- 
boratory frame. On the other hand, Newtonian mechanics and the 
Galilean transformation associated with it did not prove to be 
always correct. Because of this it would be reasonable to keep the 
system of Maxwell’s equations. 

(3) If the principle of relativity is assumed to be applicable 
to all phenomena of nature, and the system of Maxwell’s equations 
correct, the transition from one inertial frame of reference to 
another cannot be described by the Galilean transformation which 
changes the form of Maxwell’s equations. On the other hand, a 
new transformation cannot leave the form of equations of me- 
chanics intact. Consequently, the equations of mechanics have to 
be changed so that the new transformation leaves them intact. 

The last possibility formulates concisely the programme which 
is realized by the special theory of relativity: (1) the principle of 
relativity covers all phenomena of nature, (2) the velocity of 
electromagnetic waves in vacuo is the same in all IFRs (this 
follows from the invariance of Maxwell’s equations). 

But what must a transformation of coordinates and time look
like in order to meet both requirements set above? Such a transformation
will turn out to be the Lorentz transformation and we
shall examine it closely in the next chapter. In conclusion, we 
shall point out the following. 

As soon as the Galilean principle of relativity was extended to 
cover all physical phenomena, it turned into a genuine principle 
of physics. Evidently it is advisable to differentiate laws and prin- 
ciples of physics. When laws of physics are spoken of, their vali- 
dity for a limited scope of physical phenomena is implied. For 
example, Newton’s laws describe phenomena of mechanics. Max- 
well’s equations pertain to electrodynamics; and so they are the 
laws of electrodynamics. The three laws of thermodynamics deal 
with thermal phenomena. As to the principles of physics, they 
are universally important, for they cover all physical pheno- 
mena. 

The most widely known principle of physics is that of conservation
of energy. The famous book by M. Planck dedicated to
conservation of energy is called The Principle of Conservation 
of Energy (1931). We believe that the law of conservation of 
energy is true for all physical phenomena, just as we are sure 
that the law of conservation of momentum is true for all physical 
phenomena. The principle of relativity occupies its place in physics
along with the principles of conservation of energy and momentum.

§ 1.9. The velocity of light in vacuo. The velocity of light in 
vacuo occupies a special place in nature because in accordance 
with present-day conceptions it is the greatest possible velocity 
at which an interaction between objects can be transmitted. Trans- 
mission of an interaction, i.e. transmission of a certain action 
produced by one object onto another, is often referred to as trans- 
mission of a signal; it is this term that is especially popular in 
the theory of relativity. To transmit a signal means to transmit 
a momentum and energy (taken to be inseparable in the theory 
of relativity (see § 5.5)) which are capable of “switching on” a 
certain device, e.g. a trigger mechanism. 

Nothing implies a priori that there exists an upper
limit for the velocity at which signals can be transmitted in nature.
However, both theory and experiment show that all known
interactions propagate at a finite velocity; and the greatest velocity
at which a signal is transmitted is that of light in vacuo. We
shall recall that this is also the propagation velocity of electro- 
magnetic waves of any frequency in vacuo. As was already men- 
tioned, the classical theory assumed tacitly that a signal can pro- 
pagate infinitely fast. 

If one admits that there is an ultimate velocity of signal pro- 
pagation in nature, its absolute value must be the same in all 
inertial frames of reference. In fact, all these frames are equiva- 
lent according to the principle of relativity, and it is impossible 
to suggest a physical experiment to detect the difference between 
them. Had the velocity of interaction transmission been different 
in different inertial frames of reference, it would have been pos- 
sible to distinguish one inertial frame from another. This is im- 
possible, however, provided the principle of relativity is assumed 
to be universal. It follows immediately from this that the velocity 
of light in vacuo must be the same in all inertial frames of re- 
ference. 

And what if a source moves toward an observer or an observer 
moves toward a source? Such a motion cannot change the magni- 
tude of the ultimate velocity at which a signal is transmitted. 
Consequently, the velocity of light in vacuo cannot depend on the 
motion of either a source or an observer. 

Obviously, the velocity of light in vacuo has unique properties.
All velocities are relative, i.e. they change on transition from one 
inertial frame to another. But the absolute value of the velocity c 
remains the same. Although there is no privileged frame among 
all inertial frames, there is one privileged velocity in all of them. 
Both these circumstances are intrinsically associated with the fact 
that electromagnetic waves can propagate in vacuo. In other 
words, no material medium is needed for their propagation. Na- 
turally, the assumption concerning the privileged, i.e. invariant, 
velocity upsets drastically the classical arrangement, that is Eq.
(1.4) and, hence, Eqs. (1.2) and (1.3).

Nevertheless, no matter how strict or beautiful the logical 
reasoning is, an experiment was and will for ever be a supreme 
judge in physics. An experiment supports quite unambiguously the 
following two statements: (1) in a given IFR the velocity of light
in vacuo is equal at all points and in all directions; (2) in all 
IFRs this velocity has the same value. Here we refer to the Mi- 
chelson-Morley and Kennedy-Thorndike experiments described in 
Supplement II. 


CHAPTER 2 


THE EINSTEIN POSTULATES. 
THE INTERVAL BETWEEN EVENTS. 
THE LORENTZ TRANSFORMATION 


§ 2.1. Einstein’s postulates. The final part of the foregoing 
chapter was devoted to the explanation of two basic assumptions 
of the STR which are called the Einstein postulates. On account 
of their significance we shall repeat them once more here and
add some comments.

Postulate I. All identical physical phenomena proceed alike 
in inertial frames of reference in the case of equal initial condi- 
tions. In other words, there is no privileged frame among IFRs, 
and the state of absolute motion is impossible to find. 

This postulate extends the Galilean principle of relativity to all 
phenomena of nature. It puts an end to absolute space once and 
for all: since all inertial frames of reference are equivalent, they 
cannot have any privileged frame among them. It was just abso- 
lute space that served as such a privileged frame. The conception 
of “absolute” motion in vacuo which was meant as the motion 
relative to the absolute frame of reference is rejected exactly in 
the same way (see § 1.6).

Postulate II. The velocity of light in vacuo is equal in all directions
and in any region of a given inertial frame of reference,
and equal in all inertial frames of reference. 

Often this postulate is supplemented with the statement that
the velocity of light in vacuo is not affected by the velocity of a 
source. This, however, follows immediately from Postulate II for- 
mulated in the form given above. Indeed, any source can have an 
inertial frame of reference fixed rigidly to it. When a source
moves non-uniformly and/or along a curved line, an instantaneous 
co-moving inertial frame can be found. In such a frame a source 
is at rest, while all other inertial frames move relative to it (and 
a source moves relative to them). Since in accordance with Postulate II
the velocity of light is the same in all frames, it does not
depend on the velocity of a source. As to the motion of the ob- 
server, it is the relative velocity of a source and an observer that 
is essential, so that the preceding reasoning disposes of the ques- 
tion. 

It should be clearly realized what Postulate II implies. For this
purpose let us imagine that the velocity of light is measured in the
frame K in the following way. At the moment $t_1$ a light signal is
sent from the point $x_1$ along the x axis. It reaches the point $x_2$ at
the moment $t_2$. Then $c = (x_2 - x_1)/(t_2 - t_1)$. Now the same two
events, that is the sending and reception of a signal, are viewed
from the frame K’. The sending of a signal occurs for an observer
from the frame K’ at the point $x'_1$ at the moment $t'_1$ and the
reception at the point $x'_2$ at the moment $t'_2$. In spite of the fact that
the frames K and K’ move relative to each other along the common
axis x, x’, we have to get the ratio $(x'_2 - x'_1)/(t'_2 - t'_1)$ equal
to c. From the viewpoint of “common sense” this must not be
the case. (This becomes clear if one draws the diagram of the
experiment.) However, this is exactly what Postulate II prescribes.
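
The requirement can be checked numerically in a minimal sketch. It anticipates the Lorentz transformation in its standard form, which is assumed here rather than derived; the function names, the distance of 1 km and the relative velocity 0.6c are illustrative choices not taken from the text.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def galilean(x, t, v):
    """Classical transformation of an event (x, t) from K to K'."""
    return x - v * t, t


def lorentz(x, t, v, c=C):
    """Standard Lorentz transformation of an event (x, t) from K to K'."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def signal_speed(event1, event2):
    """Ratio (x2 - x1)/(t2 - t1) for two events given as (x, t) pairs."""
    (x1, t1), (x2, t2) = event1, event2
    return (x2 - x1) / (t2 - t1)


# In K: a light signal leaves x1 = 0 at t1 = 0 and reaches x2 = 1 km.
x1, t1 = 0.0, 0.0
x2 = 1000.0
t2 = t1 + (x2 - x1) / C
v = 0.6 * C  # hypothetical velocity of K' relative to K

ratio_galilean = signal_speed(galilean(x1, t1, v), galilean(x2, t2, v))
ratio_lorentz = signal_speed(lorentz(x1, t1, v), lorentz(x2, t2, v))
print("Galilean ratio / c:", ratio_galilean / C)
print("Lorentz ratio / c: ", ratio_lorentz / C)
```

Under these assumptions the classical transformation gives a ratio close to 0.4c, as "common sense" suggests, while the Lorentz transformation returns c up to rounding.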

We have formulated Postulate II, in fact, the way it was done
by Einstein himself in his article of 1905. However, in our time
it is advisable to formulate it otherwise, namely, to proceed from
the assumption that there is an ultimate velocity of signal transmission
in nature. This is the principal assumption. Then this
ultimate velocity is identified with the velocity of electromagnetic
waves, i.e. light, in vacuo. The last assumption is not obligatory:
basically, the STR would not have lost its meaning if the ultimate
velocity had turned out to be different. However, the STR makes
use of just this assumption. If one assumes that the velocity of
light in vacuo is the ultimate velocity at which an interaction can
be transmitted, it follows directly that it must have the same
magnitude in all IFRs (see § 1.9).

Having formulated the first principles of the theory of relativity,
that is, Einstein’s two postulates, one can formulate the general
objective of the special theory of relativity. Its basis is the prin- 
ciple of relativity, i.e. the equivalence of all inertial frames of re- 
ference with respect to all physical phenomena. The theory of re- 
lativity has to give such a description of physical phenomena 
which will be the same in all inertial frames of reference. Thus, 
if we have some equations at our disposal describing one or 
another group of phenomena, these equations must appear alike in 
all inertial frames of reference, each frame using its own vari- 
ables. Recall that equations of mechanics and electrodynamics 
intrinsically contain coordinates of an event and a moment of its 
occurrence. These coordinates and the moment of time registered 
for an event are transformed on transition from one inertial frame 
of reference to another. The Galilean transformation changes the 
appearance of Maxwell’s equations, but since we want to preserve 
them as equations of an electromagnetic field correct in all iner- 
tial frames, we ought to find such a transformation of coordinates 
and time that will keep the appearance of Maxwell’s equations 
invariable. Such a transformation will turn out to be the Lorentz 
transformation. 
The Lorentz transformation, however, follows directly from the
Einstein postulates. The point is that the Maxwell theory was 
developed from the very beginning as the relativistic one. The in- 
herent cause for this consists in the fact that it described cor- 
rectly the properties of light, the most relativistic object of all. 

Thus, having found the transformation of coordinates and time 
satisfying the Einstein postulates, we have to be sure that the 
basic equations of physics are the same in all inertial frames, i.e.
covariant relative to this transformation. The meaning of the
term “covariant” will be explained in § 4.3. Now we have to dwell
on the “basic laws” of physics. 

The laws of Newton are referred to as the basic laws in mechanics,
the laws of Maxwell as the basic laws in electrodynamics,
and the equations expressing the first and the second principles 
as the basic laws in thermodynamics. 

Relative quantities were known in classical physics, e.g. velocities,
coordinates, velocity directions, but the special theory of
relativity adds to them, rather unexpectedly for our intuition,
relative time intervals between events and relative scale lengths,
i.e. distances. However, this is the “price” that we must pay in
order to realize the principle of relativity with respect to all 
physical phenomena. 

And still the predominant feature of the theory of relativity, 
in spite of its title, is not at all the relativity of various quantities, 
i.e. their dependence on the choice of an inertial frame of re- 
ference. The essence of the theory of relativity consists in just the 
opposite. The theory of relativity shows that the laws of nature 
in inertial frames of reference do not depend on the choice of a 
reference frame and on a position and motion of an observer, but 
measurement results in different reference frames can be cor- 
related. Speaking in terms of philosophy, the theory of relativity 
underscores the objective character of the laws of nature and not 
the relativity of knowledge. 

Of course, trying to alter an historically established name, 
which by the way was proposed not by Einstein but by Planck in 
1906, is a hopeless undertaking. However, there is one point to
pay attention to. The controversy over the correct name for the 
theory, “special” or “partial”, is not essential. In a sense, the 
problem is how to restrict the theory to inertial frames of reference.
Essentially this restriction results in the theory which is
correct in the absence of gravitational fields or, practically, in 
weak gravitational fields. That is why the most correct name 
would be the “restricted” theory of relativity, which is adopted 
in the French literature. 

Although the Einstein postulates are the first principles of the 
theory of relativity, they are not sufficient for its development. 
The construction of a relativistic frame of reference is fundamentally
important for the theory, so we turn our attention to
this aspect now.

§ 2.2. The relativistic frame of reference. In the construction of
a relativistic frame of reference, just as in the construction of the
whole theory, the validity of both Einstein’s postulates is assumed.
Besides, we shall assume that the velocity of light in vacuo
is the ultimate velocity at which signals are transmitted. The last
assumption is not present in the Einstein postulates. However, as
we shall see later on, it has to be inevitably incorporated in the
theory if we want the principle of causality to come into effect
(see § 3.4). In § 1.1 we spoke in detail on how a reference frame is
constructed in classical mechanics. There we indicated that it was
sufficient for each reference frame to have one clock, since it was
assumed that infinitely fast signals could be used. But in the
STR the existence of a finite velocity of a signal is explicitly
allowed for, so that when the velocities in question get closer to
that ultimate one, it becomes impossible
to use only one clock. But it is just these velocities that are of
interest for the theory of relativity.

Therefore, a set of clocks is to be added to a coordinate system
constructed exactly in the way described in § 1.1. Basically, the 
STR implies that clocks are located at every point of space. This 
is not needed in practice, but as a matter of principle a clock must 
be at any point where the moment of an event is registered. All 
the clocks of a given reference frame are motionless relative to it.

It is assumed in the STR that it is possible to have at one’s
disposal as many ideal identical clocks as one needs. This as- 
sumption is easily realized in our time. According to quantum 
mechanics all microparticles of the same kind are identical. In 
particular, characteristic oscillation frequencies of atoms of the 
same kind coincide precisely. Taking the atoms themselves for 
the clocks and the periods of atomic characteristic oscillations 
for the time standards, we obtain a sufficient number of required 
clocks. 

Length standards can be dealt with in just the same manner. 
The wavelength of a characteristic radiation of a given atom can
be chosen quite adequately as a length unit. Even prior to the
advent of quantum mechanics it was believed that the wavelength 
of a radiation of a given atom can be utilized as an invariable 
length standard: to wit, that is how the length of the metre was 
immortalized by Michelson at the beginning of the 20th century. 

When we consider two IFRs moving relative to each other, the 
length scales and the clocks of each frame are at rest only with 
respect to “their own” reference frame. Is it possible to believe 
that we have identical length scales and clocks in different IFRs, 
if, say, there are such scales and clocks only in one frame? In
some literature one can come across a discourse to the point that 
length scales and clocks can be transferred from one IFR into 
another. There is no doubt that one should not do this. Transfer- 
ring clocks and length scales from one IFR into another, we im- 
part acceleration to them. Theoretically, acceleration varies the 
length of scales and the clock rate. Here are some straightforward 
examples: drop a clock or a ruler on a stone floor. A clock may 
just stop, and a ruler break. Even an atomic clock breaks down 
when atoms get destroyed. All that is the effect of acceleration. 

But in order to obtain identical length and time standards in 
different IFRs, one does not need to transfer anything from one
frame into another. It is sufficient to take a pure substance in any 
reference frame and its radiation will provide us with required 
standards. It should be emphasized how important it is to have 
length and time standards in each IFR which are truly identical 
with those in all other frames. The principle of relativity
and the equivalence of all IFRs in combination with the identity of
length and time standards make it possible to attain the complete
identity of these reference frames. 

So, every IFR has as many adequate clocks as needed. The 
time of an event at a given point is the reading of the clock 
located at the point where the event occurred at the moment of 
the occurrence of the event. If two events occurred at different 
points in space and the clocks at these points registered the same 
time for the occurrence of these events, we have to regard these 
events as simultaneous. But obviously the synchronism of events 
occurring at different points in space depends on how the initial 
time readings of these clocks were adjusted, the clock rates being 
assumed absolutely identical. Thus the determination of the syn- 
chronism of events and the adjustment of the initial time readings 
of all the clocks belonging to a given IFR, i.e. the clock syn- 
chronization, are the same thing. It should be pointed out that the 
clock synchronization, that is the determination of the synchronism 
of events, can be accomplished in different ways. The advantages 
of the synchronization suggested by Einstein will be explained 
later on. All the same, it should be emphasized that the synchronism
of events is determined, and this determination can be accomplished
in more than one way.

Here is an example showing how important it is to know how
to determine the synchronism of events. How is a velocity of a
particle found? Let a particle move along the x axis. To obtain
its velocity, one must know the position $x_1$ of the particle at the
moment $t_1$ and also its position $x_2$ at the moment $t_2$. Provided
the motion is uniform the velocity is equal to $(x_2 - x_1)/(t_2 - t_1)$.
But the arrival of the particle at the point $x_1$ is registered by a
clock located at that point and the arrival at the point $x_2$ by a
clock located at the point $x_2$. To determine the velocity, one has
to be sure that the clock located at the point $x_2$ was showing at
the moment $t_1$ the same time as the clock located at the point $x_1$.
Only in that case would the determination of the velocity make
any sense. But this just implies that the clocks must be synchronized.
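
A small sketch may show how strongly the result depends on the adjustment of the second clock. The function name, the parameter `offset_at_x2` and the numbers are hypothetical; the times `t1` and `t2` are understood as readings of a properly synchronized set of clocks, and the offset models a wrongly adjusted initial reading of the clock at the second point.

```python
def measured_velocity(x1, x2, t1, t2, offset_at_x2=0.0):
    """Velocity found from the readings of two clocks.

    t1, t2 are the readings that synchronized clocks at x1 and x2
    would show; offset_at_x2 is an error in the initial reading of
    the clock at x2 (zero when the clocks are synchronized).
    """
    reading_at_x1 = t1
    reading_at_x2 = t2 + offset_at_x2
    return (x2 - x1) / (reading_at_x2 - reading_at_x1)


# A particle covers 1000 m in 10 s, i.e. moves at 100 m/s.
print(measured_velocity(0.0, 1000.0, 0.0, 10.0))        # synchronized clocks
print(measured_velocity(0.0, 1000.0, 0.0, 10.0, 2.5))   # clock at x2 is fast
print(measured_velocity(0.0, 1000.0, 0.0, 10.0, -5.0))  # clock at x2 is slow
```

With these sample numbers the same motion would be assigned a velocity of 100, 80 or 200 m/s, depending only on how the second clock was set.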

Having explained that the determination of synchronism and 
clock synchronization are the same thing, we pass over to the 
procedure of clock synchronization within one IFR. The first
method to come to one’s mind is to collect the clocks at one spot,
verify them and then return them to their respective points. Fol- 
lowing Einstein we shall reject this procedure, because every clock 
transfer is associated with an acceleration that the clocks gain. 
Theoretically, every acceleration affects a clock rate. Consequently, 
it is better first to set the clocks at their respective points and only 
then to verify them *. 

How can one verify, i.e. synchronize, the clocks located at various
points in space? Let a clock which we shall call a reference
one be located at the origin of a given IFR. Of course, this par- 
ticular clock does not differ in any detail from all others. One can 
send a signal from the reference clock to any clock of a given 
IFR. It is assumed in this case that the distances from each of 
the clocks to the reference one are known, with no clocks being 
necessary to determine these distances. Knowing the velocity at
which a signal is transmitted, one can find the time it takes a
signal to travel from the reference clock to any clock of the frame.
If the signal from the reference clock is sent at the moment t = 0,
a synchronized clock should display exactly this travel time at the
moment when the signal reaches this clock. Although generally
speaking one can use any signal, it is most convenient to choose 
a light signal in vacuo for the purpose of clock synchronization 
in all IFRs, since it propagates at the same velocity in all IFRs. 
The utilization of a light signal in vacuo for clock synchroniza- 
tion is one more factor ensuring the complete equivalence of all 
IFRs. 

Thus, a synchronization “agent” is a light signal. Let us de- 
scribe now a synchronization procedure for a given IFR according 
to Einstein. 

1. Clocks are set at their respective points and actuated. Coor- 
dinates of points at which the clocks are located are known, and 


* There exists a voluminous literature “illustrating” that the infinitely slow
transportation of clocks does not affect their rates. No doubt, it is plausible in 
terms of physics. But inasmuch as the relativistic theory deals with relativistic 
velocities and distances, a procedure of this kind is hardly of any interest to us. 
so the distances from each of the clocks to the reference one are
known. 

2. At an arbitrarily chosen moment $t_1$ a light signal is sent
from the reference clock to the clock to be synchronized. The light 
signal travels in vacuo along the known path and its arrival is 
registered by an observer or by means of a device. 

3a. The reading of the clock at the moment of the signal arrival
is to be set to $t = t_1 + r/c$, where r is the distance to the reference
clock. The “initial” reading of the clock is thereby chosen, the 
clock is “verified” against the reference one. 

One may also use another equivalent method. 

3b. Mirrors are set at all points where clocks are located in
order to reflect light back to its source. If the reference clock
registers the return of the signal at the moment $t_2$, the moment
registered by the clock at the mirror is to be set equal to
$t = t_1 + (t_2 - t_1)/2$.
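
The two variants of the procedure can be sketched as follows. The function names and the sample values are assumptions made for illustration. Variant 3a uses the known distance r and the velocity c, while variant 3b uses only the moments of sending and return registered by the reference clock, the mirror clock being set to the midpoint of these two moments.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def setting_3a(t1, r, c=C):
    """Reading to set on the remote clock when the signal arrives (3a)."""
    return t1 + r / c


def setting_3b(t1, t2):
    """Reading to set on the clock at the mirror (3b): the midpoint
    between the sending moment t1 and the return moment t2."""
    return t1 + (t2 - t1) / 2.0


# Hypothetical example: a clock 3.0e8 m away, signal sent at t1 = 5 s.
r = 3.0e8
t1 = 5.0
t2 = t1 + 2.0 * r / C  # return moment if light travels at c both ways

print("3a:", setting_3a(t1, r))
print("3b:", setting_3b(t1, t2))
```

If light travels at the same velocity c "there" and "back", both variants prescribe the same reading, which is why the text calls them equivalent.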

The last procedure of clock synchronization has one delicate 
point. Using procedure 3a we presume the velocity of light c 
known. But we have already seen that two synchronized clocks 
are necessary to determine the velocity of motion in one direction.
On the other hand, the velocity of light is usually determined
from the motion of a beam along a closed path. In particular, the 
velocity of light could be found by means of reflection from a 
mirror, when only one clock is available. In this case one has to 
make use of procedure 3b and to know the distance from the reference
clock to the mirror. If this distance is equal to r, then
$c = 2r/(t_2 - t_1)$. However, if the velocity of light propagating
“there” is not equal to that of light propagating “back”, we are 
not able to establish this fact. It is impossible to ascertain this 
fact experimentally just because our clocks are synchronized to 
give the value c for the velocity of light. The theory of relativity, 
however, proceeds from the assumption that the velocity of light 
in vacuo is the same in all directions. Besides, the totality of ex- 
perimental data does not contradict either this statement or the 
consequences of the theory of relativity. 
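
Why a measurement along a closed path cannot reveal a difference between the velocities "there" and "back" may be seen from the following sketch. The parametrization of the one-way velocities by a number `eps` is an assumption introduced here purely for illustration and does not appear in the text; `eps = 0.5` corresponds to equal velocities in both directions.

```python
import math

C = 299_792_458.0  # velocity of light found from a closed path, m/s


def one_way_speeds(eps, c=C):
    """Hypothetical unequal one-way velocities ("there", "back")
    chosen so that the round-trip time over any distance is 2r/c."""
    return c / (2.0 * eps), c / (2.0 * (1.0 - eps))


def two_way_speed(r, c_there, c_back):
    """Velocity c = 2r/(t2 - t1) obtained with one clock and a mirror."""
    round_trip_time = r / c_there + r / c_back
    return 2.0 * r / round_trip_time


r = 1000.0  # distance from the reference clock to the mirror, m
for eps in (0.3, 0.5, 0.7):
    c_there, c_back = one_way_speeds(eps)
    print(eps, c_there / C, c_back / C, two_way_speed(r, c_there, c_back) / C)
```

For every value of `eps` tried, the single clock at the origin registers the same round-trip time and hence the same value c, so the closed-path experiment alone does not distinguish these cases.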

Thus, we have come to a relativistic frame of reference compris- 
ing a coordinate system of rigid axes and synchronized clocks 
fixed rigidly to this system. Such clocks in a given IFR will be 
referred to as a “set” of clocks. The synchronization procedure 
according to Einstein is such that it can be performed in the same 
manner in any IFR. 

In accordance with the adopted rule for clock synchronization, 
synchronism of events can also be determined as follows. Let two 
events occur at the points of space equally removed from the 
third point. If at the moment of the occurrence of the two events 
the light signals are sent from the points of events to that third 
point, the events are assumed simultaneous if both signals reach
the third point at the same moment of time. 

Of course, accelerated motion of bodies can be treated in the 
framework of the STR, whereas accelerated motion of reference 
frames, relative to inertial ones, cannot be considered. Since 
length standards and clocks are rigidly fixed to their IFR, it is 
clear that these standards and clocks should not be accelerated. 
Otherwise, the study of an influence of acceleration on length 
standards and clocks would make us examine their specific struc- 
ture and deprive the theory of its universal character. 

§ 2.3. The direct consequences of Einstein’s postulates (a few
imaginary experiments). The two direct consequences of Einstein’s
postulates, the “relativity of lengths” and the “relativity
of time intervals between events”, can be obtained directly
from the postulates themselves. Most often they are obtained
from a transformation of coordinates and time of an event. This
transformation is compatible with the Einstein postulates and is
called the Lorentz transformation. However, this convenient method
to be discussed in § 3.2 is not at all obligatory. Now we
shall describe a few “imaginary experiments” by means of which
we shall draw necessary conclusions. Imaginary experiments play
a conspicuous role in conclusions of the STR. They represent some
hypothetical experiments not to be necessarily conducted in practice.
In fact these are only generalizations permitting definite consequences
to be obtained from the given premises.* Now we pass
over to a description of several imaginary experiments whose results
we shall obtain once more when consequences of the Lorentz
transformation are discussed.

We shall begin with a very simple imaginary experiment which
illustrates the relativity of synchronism, provided the second
Einstein postulate is satisfied. Later on we shall obtain the same
result by different methods. The experiment is performed in the
Einstein train. This term is applied to any train moving uniformly
and rectilinearly at, preferably, a relativistic velocity. In
an imaginary experiment one can assume even such a thing. The
middle of the train is easy to find precisely. This is done in the
train’s reference frame and does not present any difficulties. Observer 1
is located in the middle of the train and observer 2 at
the station. Light signals are sent to observer 1 from the ends
of the train which are equally removed from him. The imaginary
experiment is so performed that the signals travelling from the


* One should not think that “imaginary experiments” (Gedankenexperimente) 
are characteristic only of the theory of relativity. The series of “imaginary 
experinients” devoted to quantum mechanics can be found in the discussion of 
N. Bohr and A. Einstein, UFN 66, 571 (1958). 
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ends of the train reach observer | just at the moment when he 
rides up to observer 2. In the imaginary experiments one does 
not usually take interest in how to accomplish this practically. 
It is essential to us what conclusions will be drawn by the two 
observers from the fact of a simultaneous arrival of the signals 
at the middle of the train. 

Observer 1. The light signals must travel equal distances until
they reach me. Consequently, they were sent simultaneously.

Observer 2. The light signals reached me when the middle of the
train was moving past me. Consequently, they were sent somewhat
earlier. But “earlier” the head of the train was closer to me than
the tail. Therefore, the signal from the tail had to be sent slightly
in advance in order to reach me simultaneously with the signal
travelling from the head. Consequently, the signal from the tail was
sent earlier than that from the head.

It is evident from this plain reasoning that two simultaneous
events in one reference frame, that is the train’s frame in our
example, are far from being simultaneous in another, which in our
example is the frame fixed to the Earth.

Fig. 2.1. The “imaginary experiment” permitting one to establish
that the length of the rulers, oriented at right angles to the
direction of the relative motion of coordinate systems, does not
vary when measured in any IFR.
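The reasoning of observer 2 in the train experiment can be followed
with numbers. The sketch below is only an illustration under assumed
values that do not come from the text: units with c = 1, a train
velocity of 0.6 and a half-length of the train equal to 1 as measured
in the station frame. The function name emission_times is
hypothetical. It uses nothing but the second postulate (both signals
travel at c in the station frame).

```python
# Station-frame bookkeeping for the Einstein-train experiment (illustrative).
# Assumed values: c = 1, train velocity v = 0.6, half-length 1.0 as measured
# in the station frame; none of these numbers come from the text.

def emission_times(half_len, v, c=1.0):
    """Station-frame moments at which the tail and head signals must be
    emitted so that both reach x = 0 at t = 0, when the middle of the
    train passes observer 2. The x axis points along the motion."""
    # Tail is at x = -half_len + v*t; its signal travels forward at c:
    #   0 - (-half_len + v*t) = c*(0 - t)
    t_tail = -half_len / (c - v)
    # Head is at x = +half_len + v*t; its signal travels backward at c:
    #   (half_len + v*t) - 0 = c*(0 - t)
    t_head = -half_len / (c + v)
    return t_tail, t_head

t_tail, t_head = emission_times(half_len=1.0, v=0.6)
head_distance = 1.0 + 0.6 * t_head   # distance of the head at its emission
tail_distance = 1.0 - 0.6 * t_tail   # distance of the tail at its emission

print("tail signal sent at t =", t_tail)
print("head signal sent at t =", t_head)
print("head was closer than the tail:", head_distance < tail_distance)
```

Both moments are negative (the signals were sent “earlier”), and the
tail signal precedes the head signal, in agreement with the
conclusion of observer 2.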

All subsequent imaginary experiments will be of a quantitative
nature. In all of them we shall be considering two IFRs designated
by K and K’ with their relative velocity directed along the common
axis x, x’ (see Fig. 1.2). It is assumed that the Cartesian axes of
the two frames coincide at the initial moment t = t’ = 0.

(a) A comparison of the lengths of parallel rulers oriented in
the direction perpendicular to that of the relative motion of two
IFRs. Let us take rulers of the same length in each reference
frame K and K’ and place them along the corresponding axes y
and y’. The equal rulers BC and B’C’ are illustrated in Fig. 2.1. A
and A’ are the middle points of the rulers in each reference
frame. Let the rulers move so that when the axes y and y’ coincide,
the middle points A and A’ also coincide. The frames K
and K’ are geometrically identical at the moment t = t’ = 0. The
question is: what quantity will be obtained for the ruler’s length
B’C’ as a result of measurements by the observer from the frame
K, and what for the ruler BC, when measured by the observer
from the frame K’? The observers have to mark the positions of
the two ends of the rulers moving past them simultaneously in
their respective reference frames. For the case considered here
synchronism is conveniently established as follows. When the
points C’ and B’ find themselves on the y axis, light signals are
sent to the point A’. In the frame K’ the sections A’C’ and B’A’
are equal, the velocity of light c is the same, so that both signals
will reach the point A’ simultaneously. Consequently, the points
C’ and B’ will cross the y axis simultaneously in the frame K’.
Exactly in the same manner the points C and B will cross the y’
axis simultaneously in the frame K’ as seen from the frame K.
Now let us measure the length of the ruler B’C’ in terms of the
frame K and the length of the ruler BC in terms of the frame K’
at the moment t = t’ = 0, when the y and y’ axes coincide. In
this case all four points C, C’, B, B’ find themselves on the
common y, y’ axis, and the observers in the two frames can compare
their results. If it turned out that CB > C’B’, or vice versa
C’B’ > CB, it would be possible to detect the difference between
the reference frames K and K’. This is inadmissible due to the
initial assumption about the equivalence of all inertial frames of
reference. That is why the observers from the frames K and K’
can only certify that CB = C’B’.

Consequently, the lengths (and the length units) oriented in
a direction perpendicular to that of the relative motion remain
constant when measured in any IFR. But this means that the
coordinates of points along the axes perpendicular to the motion
direction also remain invariable. Thus, exactly like in the
Galilean transformation

$$y' = y, \qquad z' = z. \tag{2.1}$$


(b) The comparison of clock rates in the frames K and K’. Ob- 
serving clock rates in the two frames K and K’ moving relative 
to each other, one can only compare readings of one clock from 
one frame with readings of several clocks from another frame, 
because two clocks from different reference frames get together 
at the same point in space only once. In one of the frames there 
must be at least two clocks which are supposed to be synchronized 
in the way described in § 2.2 of this chapter. For the sake of de- 
finiteness we shall be comparing one clock from the frame K’ 
with two clocks from the frame K. 

Let a clock and a light source be located at the origin O’ of the
frame K’ (Fig. 2.2a). A mirror is set on the z’ axis at the distance
z₀’ from the light source (and the clock) in the direction
perpendicular to that of the relative motion. A light signal is
transmitted from the source to the mirror from which it is reflected
back and returns to the point O’ in the time interval Δt’ = 2z₀’/c.
Both the light source and the mirror are at rest in the frame K’ and
the signal travels “there” and “back” along the same straight line,
i.e. the z’ axis.

Fig. 2.2. The “imaginary experiment” showing that the interval
between two events measured in terms of proper time is always less
than the time interval between the same events registered by means
of two clocks of any other reference frame. (The “light clock”
experiment.) (a) The calculation of the proper-time interval between
the sending and reception of a light signal at the origin O’ of the
coordinate system. (b) The calculation of the time interval between
the same events in the reference frame K relative to which a light
source and a mirror move.

Now let us consider the propagation of the same signal in the
frame K relative to which the source and the mirror move to the
right together with the frame K’ at the velocity V. Although the
signal was sent from the two coincident origins O and O’, the
reflection from the mirror will occur at some other point x₁ of
the frame K and the reception of the reflected signal at the point
x₂ of the x axis. In this way the path of the signal in the frame K
traces out the two lateral sides of an isosceles triangle. As the
path travelled by light in the frame K is greater than that in the
frame K’, one can expect that the time interval Δt between the
sending and reception of the signal, when measured in the frame K,
will be greater than Δt’. Indeed, the observer from the frame K will
certify that the two events, i.e. the emitting of light from the
point O’ and its return to the point O’, occur at the two different
points of space O and B (Fig. 2.2b). The time interval Δt between
these two events in the frame K will be measured in this case
by the two clocks removed from each other by the distance VΔt
along the motion direction. The velocity of light is equal to c in
all reference frames. Therefore, having divided the length of the
lateral sides of the triangle OAB by the velocity of light c, we
obtain the time interval Δt expressed implicitly:

$$\Delta t = \frac{2}{c}\sqrt{z_0^2 + \left(\frac{V\,\Delta t}{2}\right)^2}.$$

Finding Δt from the last equation, we get

$$\Delta t = \frac{2z_0}{c}\,\Gamma, \qquad \text{where } \Gamma = \left(1 - \frac{V^2}{c^2}\right)^{-1/2}.$$

Considering that z₀ = z₀’, it follows that

$$\Delta t = \frac{\Delta t'}{\sqrt{1 - V^2/c^2}} = \Gamma\,\Delta t' = \frac{\Delta t'}{\sqrt{1 - B^2}}, \tag{2.2}$$
where the designation B = V/c is adopted. Since both events
occurred at the same point in the frame K’, they were registered
by means of the same clock. A time interval between events
registered by means of the same clock (which implies that the
events occurred at the same point of space) is referred to as a
proper-time interval between these events. Of course, a time
interval the initial and the final moments of which are registered
at different points of the reference frame and, consequently, by
means of different clocks will not be a proper-time interval
between events. In the example just examined the proper-time
interval is equal to Δt’. It is seen from Eq. (2.2) that a time
interval between events is the least when it is determined in such
a reference frame where these events happen at the same point
in space. As we shall see in § 3.4 it is possible to indicate the
conditions providing the existence of a reference frame in which
two given events occur at one point.
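The “light clock” result can be checked numerically. The sketch
below is an illustration only: it assumes units with c = 1 and the
arbitrary values z₀ = 1 and V = 0.8, and the function names are
hypothetical. It solves the implicit equation for Δt by simple
fixed-point iteration (which converges here because V < c) and
compares the result with ΓΔt’ from Eq. (2.2).

```python
import math

C = 1.0  # assumed units in which the velocity of light equals 1


def gamma(v, c=C):
    """The factor 1/sqrt(1 - V**2/c**2) entering Eq. (2.2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def light_clock_interval(z0, v, c=C, iterations=200):
    """Solve dt = (2/c)*sqrt(z0**2 + (v*dt/2)**2) by fixed-point iteration."""
    dt = 2.0 * z0 / c  # start from the proper-time interval
    for _ in range(iterations):
        dt = 2.0 * math.sqrt(z0 ** 2 + (v * dt / 2.0) ** 2) / c
    return dt


z0, v = 1.0, 0.8                      # illustrative values, not from the text
dt_proper = 2.0 * z0 / C              # interval by the single clock at O'
dt_frame_k = light_clock_interval(z0, v)

print("proper-time interval:", dt_proper)
print("interval in frame K: ", dt_frame_k)
print("gamma * proper time: ", gamma(v) * dt_proper)
```

The interval found in the frame K exceeds the proper-time interval
and agrees with ΓΔt’ to within rounding.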

Thus we have drawn the most important conclusion: a time
interval between two events is a relative quantity; its value
depends on the choice of a reference frame. Nothing of the sort
was ever known in classical physics, where time intervals
possessed absolute properties.

This example effectively illustrates that time readings
themselves must be different in different frames. When the origins O
and O’ coincided, the clocks from the frames K and K’ located
at this point registered, according to our condition, the moments
t₁ = 0 and t₁’ = 0. When the light signal returned to O’, the
clock from the frame K’ registered the time t₂’ = t₁’ + Δt’. But at
the same moment and at the same point there is the clock from
the frame K. This clock is not the one located at O, but another
one synchronized with it. Its reading will be t₂ = t₁ + Δt. As we
have already established, Δt ≠ Δt’, i.e. the clock readings are
different. This just means that times of events are registered
differently in different reference frames. Note that this
calculation of clock readings in the frame K corresponds completely
to the rule of clock synchronization proposed by Einstein and
described in § 2.2.

(c) A comparison of the lengths of rulers arranged parallel to
the relative velocity direction. A proper frame of reference with
respect to a given object is such a frame in which this object is
at rest. It is customary to designate such a frame by K⁰. Let us
suppose now that a ruler positioned along the x⁰ axis is at rest
in this frame. Let us designate the length of the ruler in this
frame, i.e. the proper length of the ruler, by l₀. To find the length
of the ruler in any reference frame, one has to determine the
coordinates of the ends of the ruler simultaneously in this frame.
One does not care about the simultaneity of measurements only in the
frame in which the ruler rests. Since in everyday life we measure
the proper length of objects, the procedure of length measurement
is simple and can be performed by means of a direct scale
transposition.

Fig. 2.3. The “imaginary experiment” which permits detecting the
“contraction” of the ruler’s length when measured in a reference
frame in which the ruler moves uniformly and rectilinearly. The
ruler is oriented in parallel with its motion velocity. (a) The
measurement of the length of a ruler at rest in the frame K⁰ (the
proper length of a ruler). (b) The measurement of the ruler’s
length in the reference frame relative to which the ruler moves at
the velocity V.

A ruler is at rest only in one unique reference frame. In all
other inertial frames of reference moving relative to one another
the ruler moves, and the direct transposition of a unit scale
becomes impossible. Let us resort to the method of length
measurement which is also suitable for measuring the length of a
ruler moving relative to a reference frame.

Let us place the ruler’s left end at the origin O₀ where the
light source I is also located. At the ruler’s right end the mirror S
is fixed perpendicular to the x⁰ and x axes (Fig. 2.3a). Now let
us consider the following two events. The first event: a light
signal is sent from the source I toward the mirror along the x⁰
axis at the moment t = t₀ = 0. The second event: having reflected
from the mirror S, the light signal gets back to the ruler’s left
end at the point O₀. Both events are registered at the point O₀
by means of one clock. So the time interval between the events
is the proper-time interval Δt₀ which can obviously be written
down as

$$\Delta t_0 = \frac{2l_0}{c}. \tag{2.3}$$


The same two events look somewhat different to the observer
from the frame K (Fig. 2.3b). At the moment when the signal is
emitted the source I in the frame K is positioned at the point O
and the mirror S at the point S₁. By the moment of reflection the
mirror will be shifted to the point S₂ and the source I to the
point I₁. By the moment of the arrival of the signal reflected from
the mirror at the ruler’s left end the source will already be shifted
to the point I₂. The moments of time corresponding to the first
and second events are registered in the frame K at different
points and, consequently, by means of different clocks. This means
that the time interval Δt between these events can be expressed
in terms of Δt₀ according to Eq. (2.2). When light propagates to
the right, the velocity at which it overtakes the mirror S is equal
to c - V, according to the classical velocity summation in the
given IFR. When light propagates to the left, it approaches the
source at the velocity c + V. Designating the ruler’s length,
unknown so far, by l in the frame K, we obtain the time in which
light gets from the source to the mirror, t₁ = l/(c - V), and the
time in which light gets from the mirror to the source,
t₂ = l/(c + V). Therefore, the time interval between the sending
and reception of the light signal in the frame K is

$$\Delta t = t_1 + t_2 = \frac{l}{c - V} + \frac{l}{c + V} = \frac{2l}{c}\,\frac{1}{1 - B^2}.$$


Recalling that Δt₀ = √(1 - B²) Δt, and taking into account Eq.
(2.3), we obtain from the last equation

$$l = \frac{c\,\Delta t}{2}\,(1 - B^2) = \frac{c\,\Delta t_0}{2}\sqrt{1 - B^2} = l_0\sqrt{1 - B^2} = \frac{l_0}{\Gamma}. \tag{2.4}$$

Eq. (2.4) gives the length of the ruler when measured in any
inertial frame of reference. In the frame in which the ruler is at
rest (B = 0) its length is equal to l₀. It is just the fact from
which we started our reasoning. Eq. (2.4) is asymmetric with
respect to the lengths l and l₀, since it relates the proper length l₀
of the ruler in the frame K⁰ to the improper length l in any other
reference frame K.
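The chain of steps leading to Eq. (2.4) can be retraced numerically.
The sketch below is an illustration under assumed values (units with
c = 1, proper length l₀ = 1, V = 0.6); the function name
contracted_length is hypothetical. It takes Δt₀ from Eq. (2.3),
converts it to Δt by Eq. (2.2), and then solves the round-trip
relation Δt = l/(c - V) + l/(c + V) for l.

```python
import math

C = 1.0  # assumed units in which the velocity of light equals 1


def contracted_length(l0, v, c=C):
    """Length l of a ruler of proper length l0 in a frame where it moves at v."""
    big_b = v / c
    dt0 = 2.0 * l0 / c                        # Eq. (2.3), proper-time interval
    dt = dt0 / math.sqrt(1.0 - big_b ** 2)    # Eq. (2.2), interval in frame K
    # dt = l/(c - v) + l/(c + v) = 2*l*c/(c**2 - v**2); solve for l
    return dt * (c ** 2 - v ** 2) / (2.0 * c)


l0, v = 1.0, 0.6                              # illustrative values
l = contracted_length(l0, v)
round_trip = l / (C - v) + l / (C + v)        # time "there" plus time "back"

print("length in frame K:      ", l)
print("l0 * sqrt(1 - B**2):    ", l0 * math.sqrt(1.0 - (v / C) ** 2))
print("round-trip time in K:   ", round_trip)
```

The value of l obtained in this way coincides with l₀√(1 - B²), as
Eq. (2.4) states.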

Thus we have ascertained, directly from Einstein’s postulates,
the relativity of time intervals between events and the relativity
of the ruler’s lengths or scales, quantities which, in classical
mechanics, were equal in all inertial frames of reference.
These results are inherent in the theory of relativity and require
a comprehensive discussion. However, we shall postpone the
discussion of the results obtained till §§ 3.2, 3.3, since these
results, because of their significance, will be again derived by
several methods to reveal some new circumstances essential to the
interpretation of Eqs. (2.2) and (2.4).

§ 2.4. The relativity of synchronization of clocks belonging to 
two inertial frames of reference. The direct derivation of the Lo- 
rentz transformation. Up to now we have been considering syn- 
chronization of a set of clocks belonging to a given inertial frame. 
But all inertial frames are equivalent and any event can be reg- 
istered by an observer located in any inertial frame. An ob- 
server marks the coordinates of the event in his own coordinate 
grid. The set of clocks in every inertial frame of reference registers 
the time of the event by means of the clock located at the moment 
of the event at the point in space where this event occurs. 

Speaking figuratively, all space is filled up with moving clocks
belonging to different reference frames and a momentary light
flash at a given point in space, illuminating the dials of all clocks
located at it, makes it possible to determine the time of the
event, i.e. the light flash, in all reference frames whose clocks
were at that point at the moment of the light flash. In order to
make out what happens here, it is sufficient to consider the two
frames K and K’.

The question is what the clocks of the two frames K and K’ will 
show when they find themselves at one point. Of course, if we 
want to compare the readings of the clocks from different frames, 
a certain relationship should be established between the readings 
of corresponding sets of clocks. The comparison is meaningless 
without such a relationship. It should be recalled here that all the 
clocks belonging to each of the frames are synchronized. 

It turns out that all one should do is to synchronize only the 
two clocks, one of the frame K and the other of K’, which get to- 
gether at a given moment. Having synchronized this pair of clocks, 
we thereby reset the readings of the remaining clocks in each 
frame. Then it turns out that the clocks of the frames K and K’ 
show different time at all other points in space. This is a very 
significant result: the clock synchronized in one reference frame is 
dis-synchronized in terms of any other inertial frame of reference. 
In other words, if the readings of all the clocks belonging to the 
frame K are simultaneously fixed in the frame K’, it will be seen 
that all these clocks show different time in the frame K. Now we 
shall derive the requisite equations. 

Usually a relationship between the sets of synchronized clocks
in the frames K and K’ is established as follows. When the origins
of the frames K and K’ coincide, the clocks of K and K’ located
at the common origin are set to the marks t = 0 and t’ = 0. As
we shall soon see, it does not follow from this that the clocks
from the frames K and K’ show the same time at all other points
in space.

We shall need a formula for coordinate transformation for
points in space on transition from the frame K to the frame K’.
When the origins of coordinates coincide, the coordinate grid of
the frame K’ is contracted Γ times in terms of the frame K.
The proper unit scales are assumed to be the same in the frames
K and K’. Consequently, at the initial moment the coordinates x
and x’ are related by the ratio (see Eq. (2.4))

$$x = \frac{x'}{\Gamma} \qquad \left(\Gamma = \frac{1}{\sqrt{1 - B^2}}\right).$$

By the moment t the coordinate grid of the frame K’ will
shift as a whole by the distance Vt, so that we obtain
x = x’/Γ + Vt at that moment. Therefore, if the coordinate of the
point in the frame K is equal to x at the moment t, its coordinate
x’ in the frame K’ will be equal to

$$x'(x, t) = \Gamma\,(x - Vt). \tag{2.5}$$

The coordinate grid does not vary along the y and z axes (§ 2.3),
and because of this

$$y' = y, \qquad z' = z.$$


Now we are interested in the reading of the clock of the frame K’
located at the point x at the moment t. Let us designate this
reading by t’(x, t). This quantity can be defined by many methods,
but now we shall obtain it using a clock synchronization
procedure.

We shall do the following: when the origins of coordinates O
and O’ coincide and the readings of the two clocks, one from K
and the other from K’, are equal to zero, a light signal is sent
along the common x, x’ axis in the direction of the growing
values of x, x’. Next, we consider the moment t by the clock of the
frame K. At this moment the signal arrives at the point x₂ = ct
of the frame K. The arrival of the signal at the point x₂ at the
moment t represents the event with the coordinates (x₂, t) in the
frame K. In the frame K’ the same event will have the coordinates
(x₂’, t₂’), with x₂’ = ct₂’ according to the second postulate. But
Eq. (2.5) is valid for all events, so substituting x₂’ and x₂ into
its left-hand and right-hand sides and cancelling by c, we get

$$t_2' = \Gamma\,t\,(1 - B). \tag{2.6}$$

This means that a clock belonging to the set of clocks of the
frame K’ shows at the point x₂ the time t₂’ which does not at all
coincide with the time t shown at the same point by a clock of
the frame K. Thus we have different time readings; this fact has
already been discussed in § 2.3.

Now we can find the reading of still another clock of the
frame K’ at the moment t. The origin O’ will arrive at the point
x₁ = Vt at the moment t. The reference clock of the frame K’ will
also shift to this point together with the origin. During its
shifting it will register the proper-time interval
Δt’ = t₁’ - 0 = t₁’. Meanwhile the time interval between the moment
when the origins O and O’ coincide, and the moment when the origin
O’ shifts to the point x₁, is equal to Δt = t - 0 = t, when measured
by the clock of the frame K. According to Eq. (2.2)

$$t_1' = \frac{t}{\Gamma}. \tag{2.7}$$

Fig. 2.4. The dis-synchronization of the clocks of the frame K’ in
terms of the frame K. When the origins O and O’ coincide, two clocks
of the frames K and K’, which happen to be at this point, are set so
that their readings are t = 0 and t’ = 0. At the moment t (by the
frame K clock) the readings of the frame K’ clock can be found at
the points x₁ = Vt and x₂ = ct.


Thus we have come to the conclusion (Fig. 2.4) that at the
moment t (by the clock of the frame K, i.e., simultaneously in the
frame K) the clock of the frame K’ shows different time when
located at different points in the frame K:

$$\text{at the point } x_2 = ct: \quad t_2' = \Gamma\,(1 - B)\,t,$$
$$\text{at the point } x_1 = Vt: \quad t_1' = t/\Gamma.$$

And all this in spite of the fact that all the clocks from the set
of the frame K’ have been synchronized within their own frame.
But the calculation shows that all these clocks are
dis-synchronized in the frame K. We have also found that a
dis-synchronization depends on what point of the frame K is selected
for clock comparison. Let us find the difference of readings of
clocks of the frame K’ at the points x₂ and x₁:

$$\Delta t' = t_2' - t_1' = \Gamma B\,t\,(B - 1).$$

This difference of readings accumulates at the distance
Δx = x₂ - x₁ = ct(1 - B). Having assumed that a dis-synchronization
depends continuously on the distance along the x axis, one
can determine a dis-synchronization per unit length:

$$\frac{\Delta t'}{\Delta x} = -\frac{\Gamma B}{c}. \tag{2.8}$$


It is seen from Eq. (2.8) that a dis-synchronization per unit
length does not depend on the choice of a moment t, but is
determined only by the distance between the clocks of the frame K’
as measured in the frame K. Now one can get for an arbitrary pair
of points

$$t_2' - t_1' = -\frac{\Gamma B}{c}\,(x_2 - x_1).$$

We have already mentioned that sets of synchronized clocks of the
frames K and K’ are adjusted to each other by setting clocks to the
zero reading at the point x₁ = 0 at the moment when the coordinate
systems of the frames coincide. In other words, the values t = 0 and
t’ = 0 are ascribed to the readings of the clocks located at that
point. Substituting x₂ = x, we obtain from the last equation

$$t'(x, 0) = -\frac{\Gamma B}{c}\,x. \tag{2.9}$$

It is seen from Eq. (2.9) what the clock of the frame K’ located
at the point x will show at the moment t = 0 (by the clock of
the frame K). Its readings are illustrated graphically in Fig. 2.5.
A clock of the frame K’ outpaces a clock of the frame K when
located to the left of the origin, and lags behind it to the right
of the origin.

Fig. 2.5. The frame K’ clock readings at the moment t = 0 (by the
frame K clock) at the points of a coordinate x.
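The statement that the dis-synchronization per unit length does not
depend on the moment t can be illustrated with numbers. The sketch
below assumes units with c = 1 and an arbitrary velocity V = 0.6;
the function names are hypothetical. It forms the two readings of
Eqs. (2.6) and (2.7), divides their difference by x₂ - x₁, and also
evaluates Eq. (2.9) on both sides of the origin.

```python
import math

C = 1.0  # assumed units in which the velocity of light equals 1


def gamma(v, c=C):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def desync_per_unit_length(t, v, c=C):
    """(t2' - t1') / (x2 - x1) for the K' clocks at x2 = c*t and x1 = v*t."""
    g = gamma(v, c)
    t2_prime = g * t * (1.0 - v / c)   # Eq. (2.6)
    t1_prime = t / g                   # Eq. (2.7)
    return (t2_prime - t1_prime) / (c * t - v * t)


def k_prime_reading_at_t0(x, v, c=C):
    """Eq. (2.9): reading of the K' clock found at x when t = 0 in K."""
    return -gamma(v, c) * (v / c) * x / c


v = 0.6                                            # illustrative value
rates = [desync_per_unit_length(t, v) for t in (0.5, 1.0, 7.0)]
left = k_prime_reading_at_t0(-2.0, v)              # clock left of the origin
right = k_prime_reading_at_t0(2.0, v)              # clock right of the origin

print("rates for different t:", rates)
print("expected -Gamma*B/c:  ", -gamma(v) * v / C)
print("readings at x = -2, +2:", left, right)
```

The same rate is obtained for every t, and the clock to the left of
the origin is ahead (positive reading) while the one to the right
lags (negative reading).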

Now it is not difficult to ascertain what a clock of the frame
K’ will show at the moment t, when located at the point x. We
shall take advantage of the fact that the difference of the readings
of two clocks of the frame K’ does not depend on the choice of
the moment t. The clock of the frame K’ which was at the point
x - Vt at the moment t = 0, and according to Eq. (2.9) lagged
behind the reference clock by the period (ΓB/c)(x - Vt), will arrive
at the point x at the moment t. This clock will always lag behind
the reference one by this period. But at the moment t the
reference clock will show the time t₁’ = t/Γ (see Eq. (2.7)), while
the clock located at the point x will show the time

$$t'(x, t) = \frac{t}{\Gamma} - \frac{\Gamma B}{c}\,(x - Vt) = \Gamma\left(t - \frac{B}{c}\,x\right). \tag{2.10}$$

Eqs. (2.5), (2.6) and (2.10) constitute the Lorentz
transformation. Of course, the derivation of Eq. (2.10) may seem
clumsy and even superfluous. Indeed, applying the reasoning that has
led us to Eq. (2.5) to the transition from the frame K’ to K, we
obtain

$$x(x', t') = \Gamma\,(x' + Vt'). \tag{2.11}$$

Solving Eq. (2.11) with respect to t’ and substituting x’
according to Eq. (2.5), we immediately get Eq. (2.10):

$$t' = \frac{1}{V}\left\{\frac{x}{\Gamma} - x'\right\} = \frac{1}{V}\left\{\frac{x}{\Gamma} - \Gamma\,(x - Vt)\right\} = \Gamma\left(t - \frac{B}{c}\,x\right).$$
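The agreement between the two routes to Eq. (2.10) is easy to test
on a few sample events. The sketch below is an illustration under
assumed values (units with c = 1, V = 0.8, arbitrary sample points);
the function names are hypothetical. One function solves Eq. (2.11)
for t’ after substituting x’ from Eq. (2.5), the other evaluates
Eq. (2.10) directly.

```python
import math

C = 1.0  # assumed units in which the velocity of light equals 1


def gamma(v, c=C):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def t_prime_via_2_11(x, t, v, c=C):
    """t' obtained by solving Eq. (2.11) with x' taken from Eq. (2.5)."""
    g = gamma(v, c)
    x_prime = g * (x - v * t)          # Eq. (2.5)
    return (x / g - x_prime) / v       # Eq. (2.11) solved for t'


def t_prime_via_2_10(x, t, v, c=C):
    """t' from Eq. (2.10): Gamma * (t - B*x/c)."""
    return gamma(v, c) * (t - (v / c) * x / c)


v = 0.8                                              # illustrative value
samples = [(0.0, 1.0), (2.0, 0.0), (-3.0, 4.5)]      # arbitrary (x, t) pairs
diffs = [abs(t_prime_via_2_11(x, t, v) - t_prime_via_2_10(x, t, v))
         for x, t in samples]

print("largest discrepancy:", max(diffs))
```

The discrepancy stays at the level of rounding error for every
sample event.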
Having shown the relativity of clock synchronization, we cleared
up the physical meaning of different readings of clocks in different
inertial frames. Besides, the understanding of “clock
dis-synchronization” permits many baffling questions to be avoided.
In conclusion, note that the point at which the readings of the
clocks of the frames K and K’ coincide moves continuously along the
positive x axis at the velocity to be found from Eq. (2.10) by
substituting t = t’ in it:

$$\frac{x}{t} = \frac{\Gamma - 1}{\Gamma B}\,c. \tag{2.12}$$

§ 2.5. The Lorentz transformation as a consequence of Einstein’s
postulates. Let us derive the equations defining the coordinates
(x’, y’, z’, t’) of an event in the frame K’ from the coordinates
(x, y, z, t) of the same event in the frame K. These equations
performing the transformation of the coordinates of the
event must comply with Einstein’s postulates.

From the fact that space and time must be uniform in all
reference frames it follows that the relationship between the
coordinates of the event in the frames K and K’ must be linear.
Indeed, let the origin of the coordinates and the time be changed,
i.e. the transformation x = x̃ + x₀, y = ỹ + y₀, z = z̃ + z₀,
t = t̃ + t₀ be performed. If the relationship between the coordinates
of the event is linear in the frames K and K’, we obtain for x’, for
example

$$x' = a_1 x + a_2 y + a_3 z + a_4 t = a_1\tilde{x} + a_2\tilde{y} + a_3\tilde{z} + a_4\tilde{t} + (a_1 x_0 + a_2 y_0 + a_3 z_0 + a_4 t_0),$$

where a₁, a₂, a₃ and a₄ are constants. From the last equation it
is seen that the origin has shifted in the frame K’ as well, since
the expression in parentheses is the same for all points of the
frame K’. However, such a shifting is immaterial due to the
uniformity of space and time in all IFRs.

Now let us introduce into the transformation equation at least
one second-order term:

$$x' = \alpha x^2 = \alpha\tilde{x}^2 + 2\alpha x_0\tilde{x} + \ldots$$

Then the second term in the third link of this equation will depend
on x̃ and give rise to a distortion or deformation of space. This
cannot be tolerated. Consequently, the transformation to be
derived is linear.

We shall make use of the arrangement of reference frames
illustrated in Fig. 1.2. The relative velocity of the frames is
directed along the common x, x’ axis with the y and z axes parallel
to the y’ and z’ axes respectively. At the moment t = t’ = 0 the
coordinate systems coincide. The velocity of the frame K’ relative
to the frame K is equal to V. The x axis results from the
intersection of the planes y = 0 and z = 0. So if the x and x’ axes
coincide, then y’ = 0 and z’ = 0 due to the condition y = 0 and
z = 0. Thus, the transformation equations for the variables y and
z must have the following form:

$$y' = Az + By, \qquad z' = Cy + Dz.$$


Here A, B, C and D are constants. Since spatial rotations of co- 
ordinate systems are inessential for a description of physical phe- 
nomena, one can ensure, through rotation of the y’ and z’ axes 
around the x’ axis, that the plane y = 0 is transformed into the 
plane y’ = 0 and the plane z = 0 into the plane z’ = 0. Thus, 
one can put y’ = By and z’ = Dz. However, since the directions 
y and z are equivalent, i.e. space is isotropic, and the relative 
velocity of the reference frames is directed along the x, x’ axis, 
it should be true that B = D. Hence, 


$$y' = Dy, \qquad z' = Dz.$$

It remains to determine the coefficient D. Let us consider a unit
ruler located in the frame K along the y axis with the coordinates
of its ends being y₁ = 0, y₂ = 1. These coordinates would
be y₁’ = 0, y₂’ = D in the frame K’, and its length would be
l’ = y₂’ - y₁’ = D. If we took a unit ruler located in the frame K’
along the y’ axis (y₁’ = 0, y₂’ = 1), the coordinates of its ends in
the frame K would be y₁ = 0, y₂ = 1/D and the length would be
l = y₂ - y₁ = 1/D.

Thus, when measuring a unit ruler of the frame K, an observer
from the frame K’ will find its length equal to D, while an
observer from the frame K measuring a unit ruler in the frame K’
will find its length equal to 1/D. Since all inertial frames are
equivalent, such a result cannot be tolerated unless D = 1.
Therefore,

$$y = y', \qquad z = z',$$

just as it was directly obtained from Einstein’s postulates (see
§ 2.3).
Now let us derive the transformation equations for the variables
x and t. Since the transformation is linear,

$$x' = A_{11}x + A_{12}t + A_{10}, \tag{*}$$

and vice versa

$$x = A_{21}x' + A_{22}t' + A_{20},$$

where all coefficients A are constants. By the initial conditions,
t = 0 and t’ = 0 when the origins O and O’ coincide. Hence,
A₁₀ = 0 and A₂₀ = 0. When observing the point O’ we can say
that its coordinate x is equal to Vt at the moment t. So from
Eq. (*) we get

$$0 = A_{11}Vt + A_{12}t;$$

consequently, A₁₂/A₁₁ = -V. Designating* A₁₁ by Γ’, we can
rewrite Eq. (*) in the form (see § 2.4):

$$x' = \Gamma'\,(x - Vt), \tag{2.13}$$

and from the analogous reasonings

$$x = \Gamma\,(x' + Vt'). \tag{2.14}$$

Thus, the problem has reduced to the determination of the
coefficients Γ and Γ’. Due to the uniformity of time and space and
to the isotropy of space these coefficients can only depend on
the absolute value of the velocity V.

It is easy to see that Γ = Γ’. Indeed, let the scale located along
the x axis in the frame K have the proper length l₀. If one of its
ends is placed at the origin of the frame K, the coordinates of its
ends will be x₁ = 0 and x₂ = l₀ respectively. According to Eq.
(2.14) x₁’ = 0, x₂’ = l₀/Γ at the moment t = t’ = 0. (Recall that
the two frames coincide geometrically at that moment.)
Consequently, the length of the scale in terms of the frame K’ is
equal to l’ = x₂’ - x₁’ = l₀/Γ. Let us take the scale of the same
length fixed to the frame K’ and also located along the x axis. Then
the coordinates of its ends will be x₁’ = 0 and x₂’ = l₀. But in
terms of the frame K these coordinates will be x₁ = 0 and
x₂ = l₀/Γ’ at the moment t = 0 according to Eq. (2.13).
Consequently, its length is equal to l = x₂ - x₁ = l₀/Γ’, i.e. the
scale is contracted Γ’ times. Since the frames K and K’ are
equivalent and their relative velocity is the same, the contraction
must be identical and, consequently, Γ = Γ’.

* It will be shown very soon that the quantities Γ and Γ’ introduced
here are equal and coincide with the quantity Γ used in Eqs. (2.2)
and (2.4).

Now let us define the quantity $\Gamma$. The difference between the
frames K and K’ lies only in their relative motion, and $\Gamma$ can
depend on the absolute value of the velocity V alone. Let us take
advantage of the postulate on the invariant velocity of light in
vacuo in all IFRs. Suppose that at the moment $t = t' = 0$, when the
origins O and O’ of the two frames coincide, a light signal is sent
from the common origin. Let the event consist in the arrival of
the signal at some moment ($t$ in the frame K or $t'$ in the frame
K’) at some point ($x$ in the frame K or $x'$ in the frame K’)
located on the x axis. In the frame K this point has the coordinate
$x = ct$, whereas in the frame K’ the same point has the coordinate
$x' = ct'$. These times and coordinates are interrelated by
Eqs. (2.13) and (2.14), so that substituting these expressions
for $x$ and $x'$ in Eqs. (2.13) and (2.14), we obtain


$$ct' = \Gamma t (c - V), \qquad ct = \Gamma t' (c + V).$$

Multiplying term-by-term the left-hand and right-hand sides
of these two expressions and cancelling $tt'$, we get

$$\Gamma^2 \left( 1 - \frac{V^2}{c^2} \right) = 1,$$

or

$$\Gamma = \frac{1}{\sqrt{1 - V^2/c^2}}. \tag{2.15}$$

We have thus seen that this quantity $\Gamma$ coincides with the
quantity $\Gamma$ which first appeared in Eqs. (2.2) and (2.4).
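
As a quick numerical cross-check of this result, the sketch below evaluates the coefficient of Eq. (2.15) and verifies that the two light-signal relations obtained above are mutually consistent. The function names `gamma` and `light_signal_times` are illustrative choices, not notation from the text, and Python is used here only as a calculator.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def gamma(V, c=C):
    """Coefficient Gamma of Eq. (2.15) for a relative velocity V (|V| < c)."""
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)


def light_signal_times(V, t, c=C):
    """Return (t_prime, t_back) for a light signal with x = ct.

    t_prime follows from ct' = Gamma * t * (c - V); t_back is then
    recovered from ct = Gamma * t' * (c + V) and should reproduce t.
    """
    g = gamma(V, c)
    t_prime = g * t * (c - V) / c
    t_back = g * t_prime * (c + V) / c
    return t_prime, t_back


if __name__ == "__main__":
    V = 0.6 * C
    print(gamma(V))                    # close to 1.25
    print(light_signal_times(V, 1.0))  # close to (0.5, 1.0)
```

The second returned value reproduces the original time only when the coefficient has the form (2.15), which is the content of the derivation.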

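In order to find a transformation equation for the variable $t$
we shall express $t'$ from Eq. (2.14), taking into account Eq. (2.13):

$$t' = \frac{x}{\Gamma V} - \frac{x'}{V} = \frac{x}{\Gamma V} - \frac{\Gamma (x - Vt)}{V} = \Gamma \left[ t - \frac{x}{V} \left( 1 - \frac{1}{\Gamma^2} \right) \right] = \Gamma \left( t - \frac{V}{c^2} x \right).$$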


Thus, we finally obtain the transformation equations in the 
following form: 


$$x' = \Gamma (x - Vt), \quad y' = y, \quad z' = z, \quad t' = \Gamma \left( t - \frac{V}{c^2} x \right). \tag{2.16}$$


Eq. (2.16) is referred to as the Lorentz transformation. It is 
easy to rewrite it for an arbitrary direction of the relative ve- 
locity V of the frames K and K’. Indeed, we know that coordinates 
change in a motion direction and remain invariable in a direction 
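perpendicular to motion. Let us resolve the radius vector $\mathbf{r}$ of the
point into components: one parallel to the motion direction, $\mathbf{r}_\parallel$
($\mathbf{r}_\parallel \parallel \mathbf{V}$), and another perpendicular to it, $\mathbf{r}_\perp$ ($\mathbf{r}_\perp \perp \mathbf{V}$):

$$\mathbf{r} = \mathbf{r}_\parallel + \mathbf{r}_\perp.$$

Then

$$\mathbf{r}'_\parallel = \Gamma (\mathbf{r}_\parallel - \mathbf{V} t), \quad \mathbf{r}'_\perp = \mathbf{r}_\perp, \quad t' = \Gamma \left( t - \frac{r_\parallel V}{c^2} \right),$$

but

$$\mathbf{r}_\parallel = \frac{\mathbf{V} (\mathbf{r} \cdot \mathbf{V})}{V^2}, \quad r_\parallel V = \mathbf{r} \cdot \mathbf{V}.$$

Therefore

$$\mathbf{r}' = \mathbf{r}'_\parallel + \mathbf{r}'_\perp = \Gamma (\mathbf{r}_\parallel - \mathbf{V} t) + \mathbf{r}_\perp = \Gamma \left( (\mathbf{r}_\parallel + \mathbf{r}_\perp) - \mathbf{V} t \right) - (\Gamma - 1) \mathbf{r}_\perp,$$

$$\mathbf{r}_\perp = \mathbf{r} - \mathbf{r}_\parallel = \mathbf{r} - \frac{\mathbf{V} (\mathbf{r} \cdot \mathbf{V})}{V^2} = \frac{[\mathbf{V} \times [\mathbf{r} \times \mathbf{V}]]}{V^2}$$

and, consequently,

$$\mathbf{r}' = \Gamma (\mathbf{r} - \mathbf{V} t) + (\Gamma - 1) \frac{[\mathbf{V} \times [\mathbf{V} \times \mathbf{r}]]}{V^2},$$

$$t' = \Gamma \left( t - \frac{\mathbf{r} \cdot \mathbf{V}}{c^2} \right).$$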


This is the Lorentz transformation in a vector form for an 
arbitrary direction of the relative velocity. The equation for r’ 
corresponds to the classical Eq. (1.1) and transforms into it
when $\Gamma = 1$.
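
The vector form can be tried out numerically. The following is a minimal sketch, assuming units in which c = 1 by default; the function name `boost` and the representation of vectors as 3-tuples are illustrative choices, not part of the text. It splits the radius vector into the parallel and perpendicular components exactly as in the derivation above.

```python
import math


def boost(r, t, V, c=1.0):
    """Vector Lorentz transformation of the event (r, t).

    r and V are 3-tuples; V is the velocity of K' relative to K.
    Returns (r_prime, t_prime).
    """
    V2 = sum(v * v for v in V)
    if V2 == 0.0:
        return tuple(r), t
    g = 1.0 / math.sqrt(1.0 - V2 / c**2)
    rV = sum(ri * vi for ri, vi in zip(r, V))            # scalar product r . V
    r_par = tuple(vi * rV / V2 for vi in V)              # component along V
    r_perp = tuple(ri - pi for ri, pi in zip(r, r_par))  # component across V
    r_prime = tuple(g * (pi - vi * t) + qi
                    for pi, vi, qi in zip(r_par, V, r_perp))
    t_prime = g * (t - rV / c**2)
    return r_prime, t_prime


if __name__ == "__main__":
    # With V along x the result coincides with Eq. (2.16).
    print(boost((1.0, 2.0, 3.0), 2.0, (0.6, 0.0, 0.0)))
```

Applying the same function with the velocity reversed returns the original event, which is a convenient sanity check for an arbitrary direction of V.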

Once again we shall postpone discussing the meaning of the
Lorentz transformation (Eq. (2.16)) till we derive it by still
another method. That method will lead us to a realization that
the real physical world in which all phenomena of nature occur
is a four-dimensional manifold, the so-called space-time. The
special theory of relativity will appear before us as the theory of
four-dimensional space-time, as well as a theory possessing an
obvious geometrical meaning. Owing to its physical scope and the
possibility of further generalization, such an approach proved
to be of extreme importance to our whole vision of the world and
was the first step toward the creation of the theory of gravitation.

§ 2.6. The propagation of the light wave profile. An interval 
between events. Let us conduct another imaginary experiment, 
considering it in terms of two IFRs, K and K’, the frame K’ 
moving along the common x, x’ axis at the velocity V in vacuo.
At the initial moment $t = t' = 0$, when the origins O and O’
coincide, a light flash is triggered. According to the second Einstein
postulate light propagates in all directions in the frames K
and K’ at the same velocity c. Consequently, the wave profile, i.e.
the surface of equal phases, will look like a sphere in each of the
frames K and K’. The equation of this sphere can be easily written
down:

In the frame K: $x^2 + y^2 + z^2 = c^2 t^2$,
In the frame K’: $x'^2 + y'^2 + z'^2 = c^2 t'^2$.


Even if we forget everything that was said about the different time
readings $t$ and $t'$ in the frames K and K’, we can still explain
now why we wrote $t'$ instead of $t$ for the frame K’. Let us suppose
the times in the two frames to be equal, i.e. $t = t'$. Then the radii
of the spheres turn out to be equal at a given moment $t$. Thus,
the same physical object, the wave profile, is described by
two spheres of equal radii with their centres located at the
two different points O and O’. This is an absurdity. Hence, one cannot
assume $t = t'$. Let us put down the equations in the form

$$c^2 t^2 - (x^2 + y^2 + z^2) = 0,$$
$$c^2 t'^2 - (x'^2 + y'^2 + z'^2) = 0.$$


In this imaginary experiment we deal, in fact, with two events.
The first one consists in sending a signal from the origin $x_0 = 0$,
$y_0 = 0$, $z_0 = 0$ at the moment $t_0 = 0$, and the second in the arrival
of the signal at an arbitrary point of the sphere having the coordinates
$x$, $y$, $z$ at the moment $t$. If one makes up the expression

$$\sqrt{c^2 (t - t_0)^2 - (x - x_0)^2 - (y - y_0)^2 - (z - z_0)^2},$$

that is referred to as the interval between these two events and
designated by $s$, the result obtained can be formulated as follows:
the square of the interval between the two events, consisting in
sending a signal from one point and its arrival at another, must
be equal to zero in any reference frame:

$$s^2 = 0, \quad s'^2 = 0. \tag{2.17}$$


Of course, the interval between the events can be found not
only for the sending and arrival of a light beam. If the coordinates
of Event 1 are defined by the numbers $x_1$, $y_1$, $z_1$, $t_1$ and the
coordinates of Event 2 by the numbers $x_2$, $y_2$, $z_2$, $t_2$, the interval
$s_{12}$ between these events is equal to

$$s_{12} = \sqrt{c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2}.$$

The interval $s_{12}$ for arbitrary events, however, is in general not equal to zero.
Frequently it is convenient to consider events occurring at
infinitely near points and at infinitely near moments. Assuming
in this case $t_2 - t_1 = dt$, $x_2 - x_1 = dx$, $y_2 - y_1 = dy$, $z_2 - z_1 = dz$,
we obtain the interval squared in the form

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2.$$


As we shall illustrate now, the basic property of the interval 
between events is its invariance on transition from one inertial 
frame to another. 

According to Eq. (2.17) it follows from this imaginary experiment,
that is the sending and reception of a light signal, that if
$ds^2 = 0$ in one IFR, then $ds'^2 = 0$ in any other. Both $ds$ and $ds'$ are
infinitesimal quantities of the same order and consequently must
be proportional to each other. Therefore, one may put down

$$ds^2 = a \, ds'^2,$$

where a is a proportionality factor. This relationship must be 
valid for the interval between any pairs of events. Indeed, there 
are no conditions imposed on the relationship between the in- 
tervals ds and ds’ for a pair of arbitrary events. As to the special 
events, that is the sending and reception of light signals, the re- 
lationship has to be just like it is shown above. 
The coefficient a cannot depend on the coordinates x, y, z and 
the time ¢. Otherwise it would mean that different points in space 
and different moments of time are not equivalent. Since we regard 
space and time as uniform, a must be a constant depending only 
on the absolute value of the relative velocity of the two IFRs 
under consideration. Indeed, the coefficient a cannot depend on the 
direction of the relative velocity of the two IFRs, because other- 
wise it would imply inequivalence of different directions in space. 
Due to the isotropy of space we have to presume that a can only 
depend on the absolute value of the relative velocity of the iner- 
tial frames of reference in question. 
Let us consider three IFRs, having designated them K, K’ and
K” respectively, with $V_1$ being the velocity of the frame K’ relative
to K and $V_2$ the velocity of K” relative to K. We can write
that

$$ds^2 = a(V_1) \, ds'^2, \tag{*}$$
$$ds^2 = a(V_2) \, ds''^2. \tag{**}$$

Considering directly the frames K’ and K”, one can write that

$$ds'^2 = a(V_{12}) \, ds''^2,$$

where $V_{12}$ is the absolute value of the velocity of the frame K’
relative to the frame K”. Substituting the last expression in Eq.
(*) and comparing it with Eq. (**), we find that

$$\frac{a(V_2)}{a(V_1)} = a(V_{12}). \tag{***}$$

Since $V_{12}$ depends not only on the absolute values of the vectors
$\mathbf{V}_1$ and $\mathbf{V}_2$ but also on the angle between them (which does not
enter explicitly in the last equation), this relationship can be
evidently satisfied only when the coefficient $a$ is reduced to a constant
value. From the last equation it is clear that the constant $a$
can be equal only to unity. Hence,

$$ds^2 = ds'^2.$$

From the equality of infinitesimal intervals it follows that

$$s = s',$$

i.e. the interval $s$ is invariant with respect to the transformation
of coordinates and time, complying with the Einstein postulates. 
(Note that the intervals s and s’ cannot differ by an arbitrary 
constant, since from s = 0 it follows that s’ = 0.) We have al- 
ready seen and shall make sure again that such a transformation 
is the Lorentz transformation. 
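
The invariance of the interval can be illustrated numerically. The sketch below is only an illustration under stated assumptions: units with c = 1 by default, events stored as tuples (x, y, z, t), and the helper names `lorentz` and `interval_squared` are chosen here and do not come from the text. It applies Eq. (2.16) to two arbitrary events and compares the squared interval in both frames.

```python
import math


def lorentz(event, V, c=1.0):
    """Eq. (2.16): transform an event (x, y, z, t) from K to K'."""
    x, y, z, t = event
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return (g * (x - V * t), y, z, g * (t - V * x / c**2))


def interval_squared(e1, e2, c=1.0):
    """Squared interval between two events given as (x, y, z, t)."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2


if __name__ == "__main__":
    e1, e2 = (1.0, 2.0, 3.0, 0.5), (4.0, -1.0, 0.0, 6.0)
    s2 = interval_squared(e1, e2)
    s2_prime = interval_squared(lorentz(e1, 0.8), lorentz(e2, 0.8))
    print(s2, s2_prime)  # the two values agree up to rounding
```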

Thus, the expression $c^2 t^2 - x^2 - y^2 - z^2$ must remain invariable
on transition from the frame K to K’. When the frames K and K’
are arranged the way it is shown in Fig. 1.2, then $y = y'$, $z = z'$
and the sum $y^2 + z^2$ becomes an invariant. In this case the expression

$$c^2 t^2 - x^2 \tag{2.18}$$

will be, in fact, the transformation invariant.

§ 2.7. The Lorentz transformation as a consequence of the in- 
variance of the interval between events. In the previous section 
it was shown that the coordinates of two events must satisfy the
equation

$$c^2 t'^2 - x'^2 = c^2 t^2 - x^2, \quad \text{or} \quad x'^2 - c^2 t'^2 = x^2 - c^2 t^2 \tag{2.19}$$


on transition from one inertial frame of reference to another. Here 
we suppose for the sake of simplicity that one of the events has 
the coordinates (0, 0, 0, 0), and due to our agreement about re- 
ference frames it means that this event also has the coordinates 
(0, 0, 0, 0) in another frame. 

Let us take one more step to simplify the notation. The reader
has evidently already noticed how common is the product $ct$ of the
velocity of light by the time. Let us introduce a new time unit,
a light metre, which is the time interval that light requires to propagate
over the distance 1 m. Obviously, 1 m of time = 1 m/$c$, i.e.
1 m is covered in $1/c$ seconds.

Light travels $\tau$ metres in the time $\tau \cdot 1/c = t$ s. Hence, it is
clear that

$$\tau \ (\text{light} \cdot \text{m}) = ct \ (\text{s}). \tag{2.20}$$

This unit will not appear exceptional if one recalls that in astronomy
distances are measured in terms of time (and the velocity
of light), that is in light years.
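
To get a feeling for the size of the new unit, here is a small conversion sketch based on Eq. (2.20). The function names are illustrative and not from the text; the numerical value of c is the one fixed by the SI definition of the metre.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s


def seconds_to_light_metres(t_seconds):
    """Eq. (2.20): tau = c * t."""
    return C * t_seconds


def light_metres_to_seconds(tau):
    """Inverse of Eq. (2.20): t = tau / c."""
    return tau / C


if __name__ == "__main__":
    print(seconds_to_light_metres(1.0))   # one second expressed in light metres
    print(light_metres_to_seconds(1.0))   # one light metre, roughly 3.3 nanoseconds
```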

So, if time is measured in light metres, the expression for the 
invariant interval between events becomes quite simple: 


$$x'^2 - \tau'^2 = x^2 - \tau^2. \tag{2.21}$$


The easiest way to find a transformation satisfying Eq. (2.21) 
is as follows. We know from § 2.5 that a transformation of coor- 
dinates and time must be linear. Let us write down such a trans- 
formation using indefinite constant coefficients in the form 


$$x' = a_1 x + b_1 \tau,$$
$$\tau' = a_2 x + b_2 \tau. \tag{2.22}$$


Substituting Eq. (2.22) into the left-hand side of Eq. (2.21)
and grouping the coefficients at $x^2$, $\tau^2$ and $2x\tau$, we obtain

$$x'^2 - \tau'^2 = x^2 (a_1^2 - a_2^2) - \tau^2 (b_2^2 - b_1^2) + 2x\tau (a_1 b_1 - a_2 b_2) = x^2 - \tau^2. \tag{2.23}$$

The last link of this equation is written down according to Eq.
(2.21); it must be identically satisfied for any $x$ and $\tau$. This requires,
however, the following equations to be complied with:

$$a_1^2 - a_2^2 = 1, \quad b_2^2 - b_1^2 = 1, \quad a_1 b_1 - a_2 b_2 = 0. \tag{2.24}$$


These equations are very easy to satisfy by taking the
coefficients $a$ and $b$ equal to hyperbolic functions (defined and
described in Appendix I, § 9):

$$a_1 = \cosh\theta_1, \quad a_2 = \sinh\theta_1, \quad b_2 = \cosh\theta_2, \quad b_1 = \sinh\theta_2.$$

In this case the first two equations of (2.24) are satisfied automatically.
It follows from the third equation, rewritten as $a_2/a_1 = b_1/b_2$,
that $\tanh\theta_1 = \tanh\theta_2$, and in order to satisfy this equation
it is sufficient to assume $\theta_1 = \theta_2 = \theta$. Thus, the transformation
(2.22) takes the form

$$x' = x\cosh\theta + \tau\sinh\theta,$$
$$\tau' = x\sinh\theta + \tau\cosh\theta. \tag{2.25}$$
The parameter $\theta$ can depend only on the relative velocity V. It is
referred to as a velocity parameter and plays an important role
in the STR (see § 3.5). To define it, one should use the first
equation (2.25). Assuming $x' = 0$ for the origin O’, we obtain

$$\frac{x}{\tau} = -\tanh\theta. \tag{2.26}$$

However, the left-hand side of the equation, when written in
conventional time units, appears as $x/ct = B$, since $x/t$ at the
origin O’ is just equal to the velocity of the frame K’ relative to K.
So, we have found the relationship between the parameter $\theta$ and
the velocity V:

$$\tanh\theta = -B. \tag{2.27}$$


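One easily finds from this (see Appendix I, § 9)

$$\cosh\theta = \frac{1}{\sqrt{1 - \tanh^2\theta}} = \frac{1}{\sqrt{1 - B^2}} = \Gamma, \qquad \sinh\theta = \tanh\theta \cosh\theta = -\Gamma B. \tag{2.28}$$

Finally, we obtain the sought-for transformations

$$x' = x\Gamma + \tau(-\Gamma B) = \Gamma (x - B\tau),$$
$$\tau' = x(-\Gamma B) + \tau\Gamma = \Gamma (\tau - Bx). \tag{2.29}$$

For the reverse transformations we obtain

$$x = \Gamma (x' + B\tau'),$$
$$\tau = \Gamma (\tau' + Bx'). \tag{2.30}$$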


(The easiest way is to replace the primed quantities by the un- 
primed ones and B with —B.) 

These perfectly symmetric equations are easily identified with 
the same Lorentz transformation of Eq. (2.16); it is enough to 
make the substitution of Eq. (2.20). 

Thus, we have again obtained the Lorentz transformation, pro- 
ceeding from the invariance of the interval and the uniformity of 
space and time. There is no wonder in this, because the interval 
invariance is a direct consequence of the Einstein postulates. 
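
The equivalence of the hyperbolic form (2.25) and the form (2.29) is easy to confirm numerically. The sketch below is a minimal illustration with time measured in light metres; the function names `boost_hyperbolic` and `boost_gamma` are introduced here for convenience and are not notation from the text.

```python
import math


def boost_hyperbolic(x, tau, B):
    """Eq. (2.25) with the velocity parameter fixed by tanh(theta) = -B, Eq. (2.27)."""
    theta = -math.atanh(B)
    return (x * math.cosh(theta) + tau * math.sinh(theta),
            x * math.sinh(theta) + tau * math.cosh(theta))


def boost_gamma(x, tau, B):
    """Eq. (2.29): the same transformation written through Gamma and B."""
    g = 1.0 / math.sqrt(1.0 - B * B)
    return g * (x - B * tau), g * (tau - B * x)


if __name__ == "__main__":
    x, tau, B = 3.0, 5.0, 0.6
    xh, th = boost_hyperbolic(x, tau, B)
    xg, tg = boost_gamma(x, tau, B)
    print(xh, th)                               # close to (0.0, 4.0)
    print(xg, tg)                               # the same pair
    print(xh**2 - th**2, x**2 - tau**2)         # both close to -16, Eq. (2.21)
```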

§ 2.8. Complex values in the STR. Symmetric designations. 
Sometimes for the sake of formal convenience an imaginary
time coordinate $\tilde\tau = ict = i\tau$ is introduced (the tilde distinguishes
it from the real quantity $\tau = ct$). This practice is efficient
when used in the framework of the special theory of relativity,
because it frees us of the necessity to introduce and distinguish co-
and contravariant coordinates (see Appendix I, § 8). An introduction
of such coordinates is inevitable in relativistic electrodynamics,
unless the imaginary time is used. It should be pointed
out that an introduction of an imaginary time is only a matter
of convenience and one can do without it. Therefore, there is
nothing mysterious about the appearance of the number $i$. In their
final form the formulae for coordinates and time do not contain
the number $i$. This confirms once more that it plays only an auxiliary
role.

So, for the sake of formal convenience we introduce an
imaginary coordinate $\tilde\tau = ict$. Then

$$s^2 = c^2 t^2 - x^2 = -(x^2 + \tilde\tau^2).$$

Here is the derivation of the Lorentz transformation using the
imaginary variable $\tilde\tau$. Consider the plane of variables $(x, \tilde\tau)$. In
this plane the expression $x^2 + \tilde\tau^2$ represents the square of the distance from the
origin of coordinates to the point $(x, \tilde\tau)$. This distance does not
vary on rotation of the coordinate system through the angle $\varphi$ in
the plane $(x, \tilde\tau)$.

Rotation in a conventional (Euclidean) plane through the
angle $\varphi$ is described by the following equations (see Appendix I,
§ 2):

$$x' = x\cos\varphi + y\sin\varphi, \quad y' = -x\sin\varphi + y\cos\varphi, \tag{2.31}$$

where all the quantities are real.

Let us consider rotation in the plane $(x, \tilde\tau)$, where $\tilde\tau = ict$, for the case when
one of the coordinates is purely imaginary. We shall assume that
Eq. (2.31) retains its appearance in this case as well. As we shall
see later, the geometric meaning of the equations with an imaginary
variable differs essentially from that of Eqs. (2.31). So let us
put down the sought-for transformation in the form

$$x' = x\cos\varphi + \tilde\tau\sin\varphi, \tag{2.32a}$$
$$\tilde\tau' = \tilde\tau\cos\varphi - x\sin\varphi. \tag{2.32b}$$


Let us clear up the meaning of the parameter $\varphi$. It can be associated
only with the velocity V of the relative motion of the frames
K and K’, because this is the only parameter by which they differ.
Take any point in the frame K’ ($x' = \text{const}$). It moves relative
to K just as the whole frame does, i.e. at the velocity V. For any
point rigidly fixed to the frame K’ one can write $V = dx/dt$. Assuming
$x$ to be a function of $\tilde\tau = ict$, differentiate Eq. (2.32a) with respect
to $\tilde\tau$. We shall obtain $dx\cos\varphi + d\tilde\tau\sin\varphi = 0$, whence it follows
that

$$\frac{dx}{d\tilde\tau} = \frac{1}{ic} \frac{dx}{dt} = -\tan\varphi$$

and, consequently,

$$\tan\varphi = iB. \tag{2.33}$$

The tangent proved to be an imaginary quantity. This reminds
us once again that there is an imaginary quantity among the
variables.

Using conventional formulas of trigonometry one can find from
Eq. (2.33)





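$$\cos\varphi = \frac{1}{\sqrt{1 + \tan^2\varphi}} = \frac{1}{\sqrt{1 - B^2}} = \Gamma, \qquad \sin\varphi = \frac{\tan\varphi}{\sqrt{1 + \tan^2\varphi}} = \frac{iB}{\sqrt{1 - B^2}} = iB\Gamma, \tag{2.34}$$

where the designation used already in Eq. (2.15), $\Gamma = (1 - B^2)^{-1/2}$,
is introduced. Substituting the values of $\cos\varphi$ and $\sin\varphi$ in Eq.
(2.32), we obtain the sought-for transformation for the variables
$x$ and $\tilde\tau = ict$:

$$x' = \Gamma (x + iB\tilde\tau), \quad \tilde\tau' = \Gamma (\tilde\tau - iBx). \tag{2.35}$$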


The equations for the transformation of the coordinates of the
event from the frame K to K’ must differ from those for the
transformation of the coordinates of the same event from the
frame K’ to K only by the substitution of the primed values for
the unprimed ones and vice versa. Besides, the sign of the velocity
V should be changed to the opposite. In this way we obtain
from Eq. (2.35)

$$x = \Gamma (x' - iB\tilde\tau'), \quad \tilde\tau = \Gamma (\tilde\tau' + iBx'). \tag{2.36}$$

Of course, the same result would be obtained if Eqs. (2.35) were
solved directly with respect to $x$ and $\tilde\tau$.

In Eqs. (2.35) and (2.36) one can easily change over to the
real variables $x$ and $t$ by substituting $\tilde\tau = ict$ and $\tilde\tau' = ict'$. Then
we directly get the transformation formulae (2.16). The direct
and inverse transformations of the variables $x$ and $t$ have the
form

$$x' = \Gamma (x - Vt) \ \text{(a)}, \qquad x = \Gamma (x' + Vt') \ \text{(c)},$$
$$t' = \Gamma \left( t - \frac{V}{c^2} x \right) \ \text{(b)}, \qquad t = \Gamma \left( t' + \frac{V}{c^2} x' \right) \ \text{(d)}. \tag{2.37}$$

Subsequently a comparison of the Lorentz transformation given
in the form of Eqs. (2.25) and (2.32) will be of use to us. Recall
that in Eq. (2.25) all the quantities are real, while in Eq. (2.32)
the time variable is imaginary. Having made use of the relationships
(see Appendix I, § 9)

$$\cos i\alpha = \cosh\alpha, \quad \sin i\alpha = i\sinh\alpha, \quad \tan i\alpha = i\tanh\alpha,$$

we see that it is sufficient to put $\varphi = -i\theta$ (i.e. $\theta = i\varphi$) in Eq. (2.25)
in order to convert it into Eq. (2.32). Thus, in a plane of real
variables $x$, $\tau$ we deal in formal terms with the rotation of a Cartesian
system through an imaginary angle. Such a rotation resembles
very little a true rotation of a Cartesian system, and Eqs.
(2.25) defining it are only a “parody” of Eqs. (2.31) describing a
true rotation. We shall explain somewhat later the
geometrical meaning of the “rotation” of the $x$, $\tau$ axes according
to Eqs. (2.32), and now we shall derive the Lorentz transformation
in a symmetric form to be used hereinafter.
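
The statement that a rotation through an imaginary angle reproduces the real Lorentz transformation can be checked with complex arithmetic. The sketch below is an illustration only: the function name `rotate_imaginary` is chosen here, time is measured in light metres, and the imaginary coordinate is formed as i times the real time before the ordinary rotation formulas (2.32a, b) are applied with an angle satisfying Eq. (2.33).

```python
import cmath
import math


def rotate_imaginary(x, tau, B):
    """Rotate (x, i*tau) by the angle phi with tan(phi) = iB.

    Returns the pair (x', tau') with the factor i removed from the
    time coordinate again; both values are complex numbers whose
    imaginary parts vanish up to rounding.
    """
    phi = cmath.atan(1j * B)                 # purely imaginary angle for |B| < 1
    tau_im = 1j * tau                        # imaginary time coordinate
    x_p = x * cmath.cos(phi) + tau_im * cmath.sin(phi)       # Eq. (2.32a)
    tau_im_p = tau_im * cmath.cos(phi) - x * cmath.sin(phi)  # Eq. (2.32b)
    return x_p, tau_im_p / 1j


if __name__ == "__main__":
    B = 0.6
    g = 1.0 / math.sqrt(1.0 - B * B)
    x_p, tau_p = rotate_imaginary(3.0, 5.0, B)
    print(x_p, tau_p)                              # close to 0 and 4
    print(g * (3.0 - B * 5.0), g * (5.0 - B * 3.0))  # the real form, Eq. (2.29)
```

The result is real although every intermediate quantity is complex, in line with the remark that the number i plays only an auxiliary role.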

Let us introduce the symmetric designations of the basic variables
as follows:

$$x_1 = x, \quad x_2 = y, \quad x_3 = z, \quad x_4 = ict = i\tau \tag{2.38}$$

for imaginary time and

$$x^0 = ct = \tau, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z \tag{2.39}$$

for real time.

The set of variables (Eq. (2.38)) is convenient to use when
relativistic electrodynamics is described. As to the set of variables
of Eq. (2.39), it is adopted in the book [9] which contains
a description of the general theory of relativity. A transition from
the STR to the general theory of relativity is more expedient to
make without resorting to the number $i$. Let us rewrite the corresponding
transformation of variables (Eq. (2.30)) in the real form:

$$\begin{aligned}
x^0 &= \Gamma x'^0 + \Gamma B x'^1 + 0 \cdot x'^2 + 0 \cdot x'^3, \\
x^1 &= \Gamma B x'^0 + \Gamma x'^1 + 0 \cdot x'^2 + 0 \cdot x'^3, \\
x^2 &= 0 \cdot x'^0 + 0 \cdot x'^1 + 1 \cdot x'^2 + 0 \cdot x'^3, \\
x^3 &= 0 \cdot x'^0 + 0 \cdot x'^1 + 0 \cdot x'^2 + 1 \cdot x'^3,
\end{aligned} \tag{2.30'}$$

and using imaginary time (Eq. (2.36)):

$$\begin{aligned}
x_1 &= \Gamma x'_1 + 0 \cdot x'_2 + 0 \cdot x'_3 - iB\Gamma x'_4, \\
x_2 &= 0 \cdot x'_1 + 1 \cdot x'_2 + 0 \cdot x'_3 + 0 \cdot x'_4, \\
x_3 &= 0 \cdot x'_1 + 0 \cdot x'_2 + 1 \cdot x'_3 + 0 \cdot x'_4, \\
x_4 &= iB\Gamma x'_1 + 0 \cdot x'_2 + 0 \cdot x'_3 + \Gamma x'_4.
\end{aligned} \tag{2.36'}$$

The transformations (2.30') and (2.36') can be written in the
abbreviated form:

$$x^i = a^i_k x'^k \ \text{(a)}, \qquad x_i = \alpha_{ik} x'_k \ \text{(b)}. \tag{2.40}$$

In Eqs. (2.40a) and (2.40b) summation is performed for an
index $k$ running from 0 to 3 in (a) and from 1 to 4 in (b). The
index $i$ is “free”, taking on all values from 0 to 3 in (a) and
from 1 to 4 in (b), the coefficients $a^i_k$ and $\alpha_{ik}$ making up the
matrices

$$a^i_k = \begin{pmatrix} \Gamma & \Gamma B & 0 & 0 \\ \Gamma B & \Gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \ \text{(a)}, \qquad \alpha_{ik} = \begin{pmatrix} \Gamma & 0 & 0 & -iB\Gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ iB\Gamma & 0 & 0 & \Gamma \end{pmatrix} \ \text{(b)}, \tag{2.41}$$

respectively, which are referred to as the Lorentz transformation
matrices. Matrices of this form are always used for transformation
of coordinates and time on transition from one inertial frame
of reference to another. These matrices differ only in the value of
the relative velocity V, i.e. in different values of $B = V/c$.
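
The two matrices of Eq. (2.41) can be written out and tested directly. In the sketch below, which uses plain nested lists and complex numbers from the standard library, the helper names (`real_matrix`, `imaginary_matrix`, `matmul`, `transpose`, `is_close`) are illustrative assumptions rather than anything defined in the text. It checks that replacing B by −B inverts the real matrix and transposes the matrix with imaginary elements, and that the latter is orthogonal in the ordinary (non-conjugated) sense.

```python
import math


def real_matrix(B):
    """Matrix (a) of Eq. (2.41), acting on (x0, x1, x2, x3)."""
    g = 1.0 / math.sqrt(1.0 - B * B)
    return [[g, g * B, 0, 0], [g * B, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def imaginary_matrix(B):
    """Matrix (b) of Eq. (2.41), acting on (x1, x2, x3, x4 = i*tau)."""
    g = 1.0 / math.sqrt(1.0 - B * B)
    return [[g, 0, 0, -1j * B * g], [0, 1, 0, 0], [0, 0, 1, 0], [1j * B * g, 0, 0, g]]


def matmul(P, Q):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(M):
    return [[M[j][i] for j in range(4)] for i in range(4)]


def is_close(P, Q, tol=1e-12):
    return all(abs(P[i][j] - Q[i][j]) < tol for i in range(4) for j in range(4))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    B = 0.6
    print(is_close(matmul(real_matrix(B), real_matrix(-B)), IDENTITY))
    print(is_close(imaginary_matrix(-B), transpose(imaginary_matrix(B))))
    print(is_close(matmul(imaginary_matrix(B), transpose(imaginary_matrix(B))), IDENTITY))
```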


Fig. 2.6. A geometric illustration of the Lorentz transformation. The Lorentz
transformation reduces to the rotation of the $x$ and $\tau$ axes through the angle
$\varphi = \arctan B$ around the origin in the direction of the coordinate angle bisector
to their final positions $x'$, $\tau'$. The straight lines $x' = \text{const}$ are now parallel to
the $O\tau'$ axis and the straight lines $\tau' = \text{const}$ are parallel to the $Ox'$ axis (we
have passed over to the rectilinear oblique-angled system of coordinates). The
transition from the frame K to K’ corresponds to the convergence of the $\tau'$ and
$x'$ axes (a); the inverse transition, to the divergence of the $\tau$ and $x$ axes (b).


The equations of an inverse transition, i.e. a transition from
the frame K to K’, are obtained by substituting $-B$ for $B$. Let us
designate the matrix of the transition from the frame K to K’ by
$\tilde{a}^i_k$, so that

$$x'^i = \tilde{a}^i_k x^k. \tag{2.42}$$

For the matrix with the real elements the indicated substitution
brings about a completely new matrix $\tilde{a}^i_k$. But the matrix $\alpha_{ik}$
turns into $\alpha_{ki}$ (with lines and columns transposed), when $-B$ is
substituted for $B$, so that

$$x'_i = \alpha_{ki} x_k. \tag{2.43}$$


§ 2.9. A geometric illustration of the Lorentz transformation. 
Since in our choice of a relative position of inertial frames of
reference the coordinates y and z do not vary, it is sufficient to
consider the transformation of the reference frames in the plane
$(x, \tau)$.

Let the $x$ and $\tau$ axes of the frame K be depicted by two mutually
perpendicular straight lines (Fig. 2.6). To draw the axes of the
frame K’ in this diagram, let us resort to Eq. (2.29), from which
the origins of the reference frames K and K’ are seen to coincide
(when $x = 0$ and $\tau = 0$, $x' = 0$ and $\tau' = 0$ as well). The point
$x' = 0$, i.e. the origin of the frame K’, moves at the velocity V
relative to the frame K. Hence, its motion is illustrated in this
diagram by a straight line making an angle $\varphi$ with the $\tau$ axis,
the angle $\varphi$ being defined by the relation $\varphi = \arctan B$. But the
straight line $x' = 0$ is the time axis in the frame K’. Consequently,
the Lorentz transformation of the $\tau$ axis reduces to an
inclination of the $\tau'$ axis by an angle $\varphi$ to the $\tau$ axis.

The $x'$ axis is defined by the condition $\tau' = 0$. But from Eq.
(2.29) it is seen that this condition is satisfied in the frame K on
the line $\tau = Bx$. Of course, the $\tau'$ axis could also be found from
the condition $x' = 0$, and then we would obtain the same straight
line $x = B\tau$ from (2.29). Thus the equations for the new axes will
be written as follows:

$$\text{axis } \tau': \ x = B\tau; \qquad \text{axis } x': \ \tau = Bx. \tag{2.44}$$

The angle between the $x$ and $x'$ axes is defined from the relation
$\varphi = \arctan B$. Thus, the Lorentz transformation reduces to a conversion
of the rectangular reference frame $x$, $\tau$ to the oblique-angled
one $x'$, $\tau'$; the $x$ and $\tau$ axes rotate around the origin in
the direction of the bisector of the coordinate angle through the
same angle $\varphi = \arctan B$ (see Fig. 2.6a). This is what rotation
through an imaginary angle means! In formal terms rotation of
a rectangular system just considered does not at all resemble rotation
of a Cartesian system of coordinates.
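
The construction of the new axes can be verified with a few numbers. The sketch below is an illustration only, with time in light metres and function names (`axis_angle_deg`, `boost`) chosen here: it computes the inclination angle from arctan B and confirms that points on the lines of Eq. (2.44) really have a vanishing primed coordinate.

```python
import math


def axis_angle_deg(B):
    """Angle (in degrees) by which each primed axis is inclined to its unprimed one."""
    return math.degrees(math.atan(B))


def boost(x, tau, B):
    """Eq. (2.29)."""
    g = 1.0 / math.sqrt(1.0 - B * B)
    return g * (x - B * tau), g * (tau - B * x)


if __name__ == "__main__":
    B = 0.6
    print(axis_angle_deg(B))          # about 31 degrees
    print(boost(B * 2.0, 2.0, B))     # point on the tau' axis: x' close to 0
    print(boost(2.0, B * 2.0, B))     # point on the x' axis: tau' close to 0
```

As B approaches unity the angle approaches 45 degrees, i.e. both primed axes close in on the bisector of the coordinate angle.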

Our result shows that we cannot remain within the framework
of orthogonal axes $x$, $\tau$ when considering inertial frames of reference
and resorting to a geometric illustration of this transformation.
Even if the axes of the initial frame are orthogonal, a
transition to any frame K’ makes it oblique-angled. Fig. 2.6b illustrates
the transition from the orthogonal frame K’ to K according
to Eq. (2.30). But an emergence of oblique-angled coordinates
makes it necessary to distinguish between co- and contravariant
coordinates (see Appendix I, § 8). That is why it is so difficult
to bypass these notions in the STR without hiding behind the
number $i$ (see § 2.8).


CHAPTER 3 


CONSEQUENCES 
OF THE LORENTZ TRANSFORMATION. 
THE CLASSIFICATION 
OF INTERVALS AND THE PRINCIPLE 
OF CAUSALITY. THE K CALCULUS 


One should not expect to obtain any new consequences of the 
Lorentz transformation which are not obtainable directly from 
the Einstein postulates. In the final analysis the Lorentz trans- 
formation itself is a consequence of the Einstein postulates. The 
beginning of this chapter is indeed dedicated to the analysis of 
the results that have been already obtained in Chapter 2. Surely, 
they are obtained in a much simpler way from the Lorentz trans- 
formation and we shall take advantage of this fact. We shall not 
even discard an opportunity to show how all these consequences, 
including the law of velocity transformation, can be obtained even 
without resorting to a construction of a coordinate system (the 
K calculus). Naturally, the question may arise as to why the 
Lorentz transformation is significant, since all the results ob- 
tained until now can be derived by other means. The point is that 
despite the significance of the results obtained, they are not everything
we need yet. In order to be convinced of the validity of
the principle of relativity, one has to know how the basic equations
of physics are transformed on transition from one IFR to another.
It is the Lorentz transformation that makes the basis for such
transformations.

§ 3.1. On the measurement of lengths and time intervals. The 
relativity of simultaneity. The Lorentz transformation makes it 
possible to compute the coordinates of an event (including a time 
“coordinate”) on transition from one IFR to another. But an 
event is only an element of a physical phenomenon, and in the 
final analysis the major task of the STR is the corresponding 
computation of the physical quantities observed. However, prior
to tackling this task it is necessary to dwell on the measuring
procedure for basic physical quantities. The most important
physical measurements are those of distances (lengths) and time
intervals. We are accustomed to measuring lengths of objects or
distances between points, when these objects or points are motionless
relative to us. If an object is at rest, it is sufficient to lay
a unit scale along the length to be measured the necessary
number of times. This is how we do it in our everyday life, when
measuring, say, cloth or the length of a room.

It is also rather easy to measure a time interval between two 
events occurring at the same point where a clock is located. One 
has only to register the moment when the first event occurred and 
the moment of time when the second one did. The difference of 
the clock’s readings will give the time-interval between the events. 
In just this way, for example, a duration of a lecture or a football 
match is ascertained. 

But how to measure the length of an object moving relative to 
us? Let a train move past us at a great velocity and we want to 
determine its length. It is far from being simple for one man to 
do this. He has to note simultaneously the positions of the head 
of the train (the locomotive) and the tail of the train relative to 
some certain motionless points on the ground. But as soon as he 
notes the position of the locomotive and begins to turn his head, 
the tail of the train will have gone ahead. Consequently, one 
should take special care to mark simultaneously the positions of 
the head and the tail of the train. 

Having marked the simultaneous positions of the head and the 
tail of the train on the ground, we can readily measure the 
distance by conventional means used for measuring motionless 
objects. 

And how to measure a time interval between events occurring 
at different points in space? Recall how they measure a time which 
a sprinter takes to run a hundred metre race. The events in this 
case are represented by the sportsman's start and finish. And there 
is only one clock! A starter’s shot serves as a signal to start the 
race and to actuate a timer located at the finish. Sound propa- 
gates in air at the velocity of 330 m/s, so that the sportsman will 
start running before an umpire located at the finish actuates his 
timer. This is not very essential, though, because the velocity 
of the runner is very small (at best about 36 km/h = 10 m/s, 
which is small even relative to the velocity of sound). But from 
this example one may perceive that the determination of a time
interval between two events happening at different points in 
space requires attention. 

In § 2.1 we discussed how such a problem is dealt with in the 
theory of relativity: each inertial frame has its own coordinate 
system and motionless clocks are located at all its points, 
wherever needed. These clocks are synchronized within that re- 
ference frame, so that equal readings of the clocks correspond 
to the same moment of time in the frame.

When we turn to comparing events in the two inertial frames K 
and K’, we link the readings of the synchronized set of clocks of 
the frame K with the readings of the analogous set of the frame
K’ by assuming $t = 0$ and $t' = 0$ at the coinciding origins O
and O’ (see § 2.2). Recall that the Lorentz transformation is just
a conversion of the coordinates of the event in the frame K, i.e.
the coordinates $(x, y, z, t)$, into the coordinates of the same event
in the frame K’. From Eqs. (2.16) or (2.29), or (2.40), we obtain
$(x', y', z', t')$ expressed via $(x, y, z, t)$. Of course, all those
consequences of the Einstein postulates that we discussed in Chapter 2,
surprising from the viewpoint of “common sense”, can be
obtained from the Lorentz transformation. This is what we shall
now be occupied with.

First, let us derive two convenient equations that we shall need
later. Consider two arbitrary events I $(x_1, y_1, z_1, t_1)$ and
II $(x_2, y_2, z_2, t_2)$. The conversion of coordinates and times of these
events into the frame K’ yields, according to Eq. (2.37a, b),

$$t_2' = \Gamma \left( t_2 - \frac{B}{c} x_2 \right), \qquad x_2' = \Gamma (x_2 - V t_2),$$
$$t_1' = \Gamma \left( t_1 - \frac{B}{c} x_1 \right), \qquad x_1' = \Gamma (x_1 - V t_1).$$

Making up the differences $t_2' - t_1'$ and $x_2' - x_1'$, i.e. subtracting
the lower equations from the upper ones, and designating
$\Delta x = x_2 - x_1$, $\Delta x' = x_2' - x_1'$, $\Delta t = t_2 - t_1$, $\Delta t' = t_2' - t_1'$, we obtain
the necessary equations (the inverse transition equations are also
written out):

$$\Delta x' = \Gamma (\Delta x - V \Delta t), \quad (3.1) \qquad \Delta x = \Gamma (\Delta x' + V \Delta t'), \quad (3.1')$$
$$\Delta t' = \Gamma \left( \Delta t - \frac{B}{c} \Delta x \right), \quad (3.2) \qquad \Delta t = \Gamma \left( \Delta t' + \frac{B}{c} \Delta x' \right). \quad (3.2')$$

In fact, Eqs. (3.1) and (3.2) as well as Eqs. (3.1’) and (3.2’) are
the Lorentz transformations for differences of spatial coordinates
and times of the two events. These equations should be supplemented
with the relations $\Delta y' = \Delta y$ and $\Delta z' = \Delta z$.

It immediately follows from Eq. (3.2) that two events simultaneous
in the frame K are not simultaneous in the frame K’. Indeed,
assuming $\Delta t = 0$ in Eq. (3.2), we obtain

$$\Delta t' = -\Gamma \frac{B}{c} \Delta x. \tag{3.3}$$

It is seen that $\Delta t' \neq 0$ if $\Delta x \neq 0$. But if $\Delta x = 0$ at $\Delta t = 0$ as
well, the events either coincide or happen in the plane $x = \text{const}$.
For such events $\Delta t' = 0$.
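
A short numerical illustration of Eqs. (3.2) and (3.3) follows. It is a sketch with an illustrative function name (`delta_t_prime`) and SI units assumed; the sample values (two events one light-second apart, a frame moving at 0.6c) are chosen only for convenience.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def delta_t_prime(dt, dx, V, c=C):
    """Eq. (3.2): time difference of two events in K' from dt and dx measured in K."""
    B = V / c
    g = 1.0 / math.sqrt(1.0 - B * B)
    return g * (dt - B / c * dx)


if __name__ == "__main__":
    # Two events simultaneous in K (dt = 0) and one light-second apart along x.
    print(delta_t_prime(0.0, C, 0.6 * C))    # close to -0.75 s, Eq. (3.3)
    # Events in the same plane x = const remain simultaneous in K'.
    print(delta_t_prime(0.0, 0.0, 0.6 * C))
```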

Two conditions follow from Eq. (3.2). Provided they are satisfied,
one may ignore the relativity of time intervals between
events. First, one should suppose $B \ll 1$; then $\Gamma \approx 1$ and the following
expression can be written: $\Delta t' = \Delta t - (V/c)(\Delta x/c)$. The
second term in this expression can be ignored if the ratio $\Delta x/c$
is small. This second condition is certainly satisfied if
events occur in a limited region of space along the x axis, i.e. at
small $\Delta x$. No limitations are imposed, however, on the region of
space along the directions y and z, since all events in the plane
$x = \text{const}$ that are simultaneous in the frame K happen at the same
moment of time according to the clock of the frame K’.

Certainly, the relativity of simultaneity reveals itself in the
desynchronization of clocks that we examined in detail in § 2.4.
It is sufficient to put t′ = 0 in Eq. (2.37b), and we immediately
obtain Eq. (2.9) which did not come easily before. As soon as we
assume Δt = 0 in Eq. (3.2), Eq. (2.8) is obtained. This example
shows how much can be hidden in seemingly plain “transformations”.

In everyday life a violation of simultaneity is not perceptible:
the time difference Δt′ is proportional to β/c, as is seen from
Eq. (3.2), provided that Δt = 0 is assumed (simultaneity in the
frame K). From the same Eq. (3.2) it is seen that Δt′ can assume
a substantial value only when the product (V/c²)Δx is substantial,
i.e. provided that Δx is great.

It is very important to point out that the relativity of
simultaneity is dictated by the finiteness of the velocity of light.
If a formal passing to the limit c → ∞ is performed (in fact, it
means that β → 0), simultaneity becomes absolute. This result
corresponds to the case of small relative velocities of reference
frames.

It is seen from Eqs. (3.1) and (3.1′) that two events occurring
at points of space with the same x coordinate in the frame K, i.e.
at the same point of the frame K, will have different x′ coordinates
in the frame K′. Indeed, from Eq. (3.1) with Δx = 0 we get
Δx′ = Γ(Δx − VΔt) = −ΓVΔt. But Δt is a proper-time interval, and
so ΓΔt = Δt′. Hence, Δx′ = −VΔt′. The meaning of the last
result is evident; it defines the displacement of the point x
relative to the frame K′ registered in the frame K′.

§ 3.2. Relativity of length of moving rulers (scales). A visible 
shape of objects moving at relativistic velocities. Let us consider 
now a measurement of the length of a moving ruler. Let the 
clocks in the frame K be synchronized and spatial marks made. 
Suppose that a ruler oriented along the x axis moves relative to 
the frame K at the velocity V. One can fix the frame K’ to this 
ruler. How does one measure the length of the same ruler in the
frame K? Obviously, the coordinates of the ruler’s ends have to be
determined simultaneously in the frame K. This requirement of
simultaneity leads us to the strange result which we have already
discussed in § 2.3: the ruler’s length, when measured in a
reference frame relative to which the ruler moves, turns out to be
less than the length of the same ruler in the reference frame
where it is at rest. So, let the ruler be at rest in the frame K′ and
the coordinates of its ends be x′₁ and x′₂. By definition, its length
in the frame K′, called, as we have indicated, the proper length,
is equal to x′₂ − x′₁. The proper length of the ruler is designated
by l₀, i.e. l₀ = x′₂ − x′₁. Since the ruler is motionless in the frame
K′, one need not worry about the simultaneity of measurements of
the coordinates of its ends: its length can be measured by any
conventional means.

In the frame K the coordinates of the ruler’s ends will be
determined according to the Lorentz transformation (see Eq.
(2.37a)):

$$x'_1 = \Gamma(x_1 - V t_1), \qquad x'_2 = \Gamma(x_2 - V t_2).$$


Having formed the difference x′₂ − x′₁, we obtain

$$x'_2 - x'_1 = \Gamma\{(x_2 - x_1) - V(t_2 - t_1)\}. \quad (3.4)$$


The proper length of the ruler constitutes the left-hand side of
Eq. (3.4). In the braces of the right-hand side there are Δx and
Δt for the two events, the position of the ruler’s left end being x₁
at the moment t₁ and the position of the ruler’s right end being x₂
at the moment t₂ (in the frame K).

The quantity Δx will be the ruler’s length in the frame K only
if the positions of the ruler’s ends are registered simultaneously
in this frame. Otherwise we can get any value for Δx. Recall the
example cited at the beginning of § 3.1 concerning the
determination of the length of a moving train: you have just marked
the position of the tail of the train and are slowly turning your
head toward the locomotive. If you then mark the position of the
locomotive and measure the distance between the marks, the distance
measured will exceed the train’s length in the frame K and, if the
head is turned slowly enough, even its proper length. Now
proceed in the opposite direction: first mark the position of the
locomotive and then turn your head slowly to the tail of the
train. It is easy to see that if one turns one’s head slowly
enough, the train’s length may even come out equal to zero. This is
just what Eq. (3.4) expresses. Thus, in order to determine the
ruler’s length unambiguously in the frame K, one has to consider
the two simultaneous events in K: the coincidence of the ruler’s
left end with a certain spatial mark, say x₁, and that of the ruler’s
right end with another spatial mark, say x₂. This means that
Δx = l only if Δt = 0. But then it follows from Eq. (3.4) that
l₀ = Γl, where l is the ruler’s length determined in the frame K.
According to custom the last equation is written as follows:


$$l = l_0\sqrt{1 - \frac{V^2}{c^2}} = l_0\sqrt{1 - \beta^2}. \quad (3.5)$$

Of course, under the same conditions of the problem one may
make use of the inverse transformation equations:


$$x_1 = \Gamma(x'_1 + V t'_1), \qquad x_2 = \Gamma(x'_2 + V t'_2).$$


Subtracting the left-hand equation from the right-hand one, we
obtain


$$\Delta x = \Gamma l_0 + \Gamma V (t'_2 - t'_1) = \Gamma l_0 + \Gamma V\,\Delta t'.$$

But Δx becomes the length l only under the condition Δt = 0.
Having expressed Δt′ in terms of Δt and Δx from Eq. (2.37b),
Δt′ = Γ(Δt − (V/c²)Δx), and having assumed Δt = 0 in this
equation, we obtain Eq. (3.5) again.
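
The role of simultaneity in Eq. (3.4) can be illustrated with a short numerical sketch. It is only an illustration under assumed values: a hypothetical train of proper length 100 m moving at 0.6c, with the delay between the two marks given as c·Δt in metres. The function names are not from the text.

```python
import math

def lorentz_factor(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def distance_between_marks(l0, beta, c_dt):
    """Solve Eq. (3.4), l0 = Γ(Δx − V Δt), for Δx; c_dt is c·Δt = c·(t2 − t1)."""
    return l0 / lorentz_factor(beta) + beta * c_dt

l0, beta = 100.0, 0.6        # hypothetical train: proper length 100 m, V = 0.6c
g = lorentz_factor(beta)

simultaneous = distance_between_marks(l0, beta, 0.0)    # Δt = 0 gives Eq. (3.5)
tail_first = distance_between_marks(l0, beta, 50.0)     # locomotive marked later
zero_length = distance_between_marks(l0, beta, -l0 / (g * beta))  # locomotive marked earlier

# The inverse route: l = Γ l0 + Γ V Δt' with Δt' = −Γ (V/c²) l, solved for l.
via_inverse = g * l0 / (1.0 + g**2 * beta**2)

print(simultaneous, tail_first, zero_length, via_inverse)
```

Only the first value is the train’s length in K; the other two show that non-simultaneous marks can yield anything from zero upward, while the inverse route reproduces the simultaneous result.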
Fig. 3.1. Measurement of the length of a moving ruler.


Here are two more methods of determining the length of a
moving ruler which, naturally, bring about the same result. Let
the ruler AB be at rest in the frame K, so that its left end A
coincides with the origin O (Fig. 3.1). At the moment t = 0 the
origin O′ of the frame K′ coincides with O, the clock’s reading
in O′ being t′ = 0. Then the observer located at B registers the
moment when O′ passes the point B. Let it be the moment t. The
velocity of the frame K′ is known. Therefore, the proper length of
the ruler is l₀ = Vt. This is the proper length of the ruler because
it is measured by the scale and clocks of the frame K in which
the ruler is at rest. The velocity is also determined in the frame K.
On the other hand, the observer from the frame K′ located at the
point O′ will register his passing by the points A (at the moment
t′ = 0) and B (at the moment t′) using his clock. But the ruler
moves past this observer also at the velocity V (in the opposite
direction), and he will find the length of the moving ruler to be
equal to l = VΔt′. But Δt′ = (t′ − 0) is the proper-time interval
between the two events, the coincidence of O′ with A and then
with B. As to Δt = (t − 0) = t, it is the time interval between
the same events in the frame K. According to Eq. (2.2) Δt = ΓΔt′,
and inasmuch as l/l₀ = Δt′/Δt, we have l/l₀ = 1/Γ.


The relativity of lengths is a direct consequence of the relativity
of simultaneity. Let a ruler be at rest in the frame K′, the
coordinates of its ends being x′₁ and x′₂. The proper length is
l₀ = Δx′ = x′₂ − x′₁.

Let two light bulbs fixed at the ruler’s ends in the frame K′
flash simultaneously (Δt′ = 0), and let these two events be
registered in the frame K. Let us find the distance between the
points at which these events occur in the frame K:
Δx = Γ(Δx′ + VΔt′) = ΓΔx′ = Γl₀ (see Eq. (3.1′)). This means that
the distance between the points at which these two events occur
is greater than the proper length of the ruler. But this distance is
not the ruler’s length which will be measured by the observer
from K. In order to find the ruler’s length in the frame K, one
must find the coordinates of the ruler’s ends simultaneously in
this frame. The events which are simultaneous in the frame K′
are delayed relative to one another by

$$\Delta t = \Gamma\left(\Delta t' + \frac{V}{c^2}\,\Delta x'\right) = \Gamma\,\frac{V}{c^2}\,l_0$$

in the frame K (Fig. 3.2). But during this time the ruler’s end x₁
will shift in the direction of motion by the distance
VΔt = Γβ²l₀. Hence the measured length of the ruler will be less
than Γl₀ by VΔt, i.e.

$$l = \Gamma l_0 - \Gamma\beta^2 l_0 = l_0\sqrt{1 - \beta^2}.$$

Fig. 3.2. Relativity of the rulers’ lengths as a consequence of the
relativity of simultaneity of two events.
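
The flash-bulb argument can be followed step by step in numbers. This is a sketch under assumed values (β = 0.8 and a ruler of unit proper length, with times expressed as c·Δt); the variable names are illustrative.

```python
import math

beta, l0 = 0.8, 1.0                      # assumed velocity and proper length
g = 1.0 / math.sqrt(1.0 - beta**2)       # the factor Γ

dx = g * l0                # distance between the two flashes in K, Eq. (3.1') with Δt' = 0
c_dt = g * beta * l0       # c·Δt, the delay between the flashes in K, Eq. (3.2')
shift = beta * c_dt        # V·Δt, the path covered by the ruler during the delay
l = dx - shift             # length of the moving ruler in K

print(dx, c_dt, shift, l, l0 * math.sqrt(1.0 - beta**2))
```

The last two printed numbers coincide up to rounding, which is the content of Eq. (3.5).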


We shall also recall that Eq. (3.5) was obtained directly from
the Einstein postulates (§ 2.3). Now it is time to dwell on the
physical meaning of Eq. (3.5). We have found that the length
of a physical object, for example a ruler, is relative, i.e. different
in different reference frames. The ruler possesses a maximum
length in the frame in which it is motionless, i.e. the proper length
is the greatest. If the ruler’s length is determined in an inertial
frame, relative to which the ruler moves, its length will prove
to be less than the proper length. It follows from Eq. (3.5) that
if the ruler were able to move at the velocity c, its length would
be equal to zero. But this just cannot happen: any object
possessing a finite rest mass, including any feasible frame of
reference, cannot reach the velocity c.

What does a contraction of a ruler mean? Frequently one may
hear the question as to whether the ruler “actually” gets shorter.
To begin with, it is clear that no contraction of the ruler can
take place. This follows from the basic principle of the STR,
the principle of equivalence of all IFRs. The physical state of the
ruler is the same in all IFRs. So there is no question of an
emergence of any stresses leading to the ruler’s deformation. The
ruler’s “contraction” comes about solely due to differing methods
of length measurements in two reference frames. On the other
hand, the observed relativity of the ruler’s length is not due to
the observer’s illusion. This result can be obtained by any
reasonable method of measuring the length of a moving object.
Moreover, when analysing physical phenomena in a given reference
frame, one should adopt as the object’s length the quantity l
given by Eq. (3.5), and not l₀.

It is extremely unfair to speak of the “Lorentz contraction”
when alluding to Eq. (3.5), although indeed H.A. Lorentz was
the first to suggest this equation in 1892. However, it was
interpreted quite differently (see Supplement II) from what we have
just discussed.

It was Einstein who spoke clearly about the reality of the
Lorentz contraction: “There is no point to question whether the
Lorentz contraction is real or not. The contraction is not real,
since it does not exist for an observer moving with the object.
However, it is real, since it can be fundamentally proved by
physical means for an observer not moving with the object.”

Another question is often raised: what is the “actual” length
of the ruler? This question has no meaning if asked “in a broad
sense”: the question about the ruler’s length regardless of a
frame does not have any meaning. The ruler has its length in
each reference frame; this is just its “actual” length. All inertial
frames of reference are equivalent and so are the ruler’s length
values determined in these frames. In any reference frame the
ruler will behave as if it has the length determined in this frame.
Although all IFRs are equivalent among themselves, there is still
one “selected” coordinate system to which we are accustomed.
This is the system in which the ruler is at rest. From the
viewpoint of our customary concepts this is just the “actual” length
of the ruler. We are prone to adopt it as a true length, but this
length defines the behaviour of the ruler only in this “inherent”
reference frame.

Finally, the last remark. The ruler exists objectively, i.e. outside 
our consciousness and ourselves. But is there any length before 
measurements are made? A length as a certain number emerges 
as a result of measurements and a choice of units of length. Of 
course, the ruler possesses extent, or length, if you want, as a 
quality before a measurement, but the numerical value of length 
originates only after measurement. Thus, the numerical value of 
length of existing objects emerges after measurement, and the 
result of measurement, as we have established, depends on what 
kind of instruments are used. 

Let us consider two rulers of the same proper length, one at rest
in the reference frame K′ and the other in the frame K″, and
denote their measured lengths by l′ and l″. Measurements carried
out in the frame K′ will give


$$l' > l''. \quad (3.6)$$

Measurements carried out in the frame K″ will give

$$l'' > l'. \quad (3.7)$$


The inequalities (3.6) and (3.7) are far from being contradictory
because (3.6) was obtained for the scales and clocks of the frame
K′ and (3.7) for the scales and clocks of the frame K″. The
difference in the values of the ruler’s length is dictated by the
fact that simultaneity in the frames K′ and K″ is defined
differently. The difficulty in the interpretation of the conclusions
of the special theory of relativity lies not in the existence of
relative quantities, but in the detection of the equivalence of all
inertial frames of reference. The inequalities (3.6) and (3.7)
indicate just this equivalence.

The conclusion of the special theory of relativity concerning
the relativity of the length of a moving object is unusual partially
because in our everyday life we do not perceive such an effect.
Let us consider the fastest motion within reach, that is the orbital
motion of the Earth. In this case V = 30 km/s. The ratio
V/c = 10⁻⁴ and l = l₀√(1 − 10⁻⁸) ≈ l₀(1 − 5·10⁻⁹) ≈ l₀. It should
be stressed again that the contraction of lengths is a direct
consequence of the finiteness of the velocity of light. If the
velocity of light were infinite, the ruler’s length would be the same
in all reference frames in accordance with Eq. (3.5). This can also
be seen from the fact that in the case of c → ∞ the simultaneity of
events becomes absolute.
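
The smallness of the effect for the Earth’s orbital motion can be checked directly. The sketch below uses the text’s rounded velocity of 30 km/s; the Earth’s mean diameter is an assumed figure added only to give a sense of scale.

```python
import math

C_KM_S = 299_792.458      # speed of light, km/s
V_ORBIT_KM_S = 30.0       # Earth's orbital velocity as rounded in the text

beta = V_ORBIT_KM_S / C_KM_S
exact = 1.0 - math.sqrt(1.0 - beta**2)   # relative shortening 1 - l/l0 from Eq. (3.5)
approx = beta**2 / 2.0                   # leading term of the expansion in beta

print(f"beta = {beta:.3e}")
print(f"relative shortening: exact {exact:.3e}, approximate {approx:.3e}")

EARTH_DIAMETER_M = 1.2742e7              # assumed mean diameter, for scale only
print(f"shortening of the Earth's diameter: {approx * EARTH_DIAMETER_M * 100:.1f} cm")
```

A relative change of a few parts in a billion amounts to only centimetres over the whole diameter of the planet, which is why the contraction goes unnoticed in everyday experience.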

Although until now we always discussed the relativity of 
lengths of objects (rulers), it should be borne in mind that actually 
we dealt with the relativity of distances between two motionless
points in one reference frame when measured by instruments from
another frame. 

Let us consider a cube at rest in the frame K with the sides
Δx, Δy, Δz and the proper volume Δ𝒱₀ = Δx Δy Δz. According to
the Lorentz transformation in an arbitrary IFR K′ we have
Δx′ = (1/Γ)Δx, Δy′ = Δy, Δz′ = Δz and, consequently,


$$\Delta\mathcal{V}' = \Delta x'\,\Delta y'\,\Delta z' = \frac{1}{\Gamma}\,\Delta x\,\Delta y\,\Delta z = \frac{1}{\Gamma}\,\Delta\mathcal{V}_0.$$


Hence, the change of the cube’s volume on transition from the
frame K to the frame K′ is determined as


$$\mathcal{V} = \mathcal{V}_0\sqrt{1 - \beta^2}. \quad (3.8)$$


It follows from the result obtained that the proper volume of an
object is an invariant of the Lorentz transformation:


$$\Gamma'\,d\mathcal{V}' = \Gamma''\,d\mathcal{V}'' = d\mathcal{V}_0.$$
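
A brief numerical sketch of Eq. (3.8) and of the invariance just stated may be helpful. The velocities 0.6c and 0.8c stand for two hypothetical frames K′ and K″; the function names are illustrative.

```python
import math

def lorentz_factor(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def moving_cube(side, beta):
    """Sides and volume of a cube of proper edge `side` in a frame where it
    moves along x with the velocity beta*c (Eq. (3.8))."""
    g = lorentz_factor(beta)
    dx, dy, dz = side / g, side, side     # only the extent along x contracts
    return (dx, dy, dz), dx * dy * dz

for beta in (0.6, 0.8):                   # two hypothetical frames K' and K''
    sides, vol = moving_cube(1.0, beta)
    # The last column is Γ times the measured volume, i.e. the proper volume.
    print(beta, sides, vol, lorentz_factor(beta) * vol)
```

The measured volume differs from frame to frame, while the product in the last column returns the same proper volume each time.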


Is it possible to observe directly the Lorentz contraction by, say,
taking photographs of a rapidly moving object? In the first paper
by Einstein dedicated to the theory of relativity one may read the
following: a moving body which at rest has a spherical shape is
observed from a stationary frame as an ellipsoid with the semiaxes
R(1 − β²)^(1/2), R, R. The word “observed” here can be interpreted
as a visual observation or photographing. For about fifty
years after the advent of the STR everybody was sure that the
visible shape of a relativistic sphere was an ellipsoid. However,
it turned out that the problem of the visible shape of objects
moving at a relativistic velocity requires many circumstances to
be taken into account, and that a rapidly moving sphere remains
spherical in appearance. If one assumes that an eye and a
photographic plate fix an instantaneous image produced by light,
this will mean that the image is produced by rays arriving from
different sections of the observed object simultaneously at a retina
or a photographic plate. But if the optical paths of light going from
various points of the observed object are different, a photographic
plate will register the positions of the object’s points at different
moments of time prior to the moment of photographing. The whole
effect is caused by the finiteness of the velocity of light. Using a
plain example, we shall show why the visible shape of a moving
sphere coincides with that of a stationary sphere.

Let us imagine a luminous cube moving along a straight line 
parallel to one of its faces past a photographic camera or an 
observer. The photographing or observation takes place at the 
moment when the centre of the cube crosses the line perpen- 
dicular to the motion direction and drawn through the point at 
which the observer is located (Fig. 3.3a). Naturally, we must
know beforehand that the moving object has the form of a cube
in its own reference frame. 

At a definite moment all photons emitted simultaneously (in
the frame fixed to the plate) at the points of the line AD will
reach the plate together with those emitted at the point B earlier
by the time interval l/c, l being the edge length of the cube. But
at that moment the point B was in the position B′. The
simultaneous determination of the positions of the points A and D
in the frame fixed to the plate leads, in accordance with the
conventional rule of length measurement, to the Lorentz
contraction: l′ = l(1 − β²)^(1/2). On the other hand,
BB′ = (l/c)v = βl. From Fig. 3.3b and c one can realize that the
picture of a moving cube that would be seen by a motionless
idealized observer coincides with the picture of a motionless
cube turned through a certain angle φ. This angle is determined
from the relation sin φ = β. This is a particular case of a more
general result: any three-dimensional moving object is seen turned
at a given moment. If the cube is so positioned relative to the
observer that it is seen at the angle θ′ relative to the x′ axis when
at rest, it will be observed turned through another angle. If the
cube is removed far enough from the observer, light travelling from
it can be taken for a parallel pencil of rays. When this pencil is
observed in the frame K, it propagates at the angle θ to the x axis,
as seen by the observer from K, the angles θ and θ′ being related
by the equation (see Eq. (7.11)):


$$\cos\theta = \frac{\cos\theta' + \beta}{1 + \beta\cos\theta'}.$$


A variation of a plane wave front direction on transition from one
reference frame to another, moving relative to the former one,
is called an aberration of light. The image registered on the plate
in this case corresponds to the cube observed in the frame K at
the angle θ and turned through the angle θ − θ′.

Fig. 3.3. Visual observation of a cube moving past an observer:
(a) the mutual disposition of the observer and the cube at v = 0;
(b) the visible picture of the moving cube; (c) the possible
interpretation of the visible picture by an observer: the rotation
of the cube through the angle φ = arcsin β; (d) the observation of
the moving cube at the angle θ.
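
The link between the aberration relation and the turn angle sin φ = β can be checked numerically. The sketch below takes the relation exactly as quoted from Eq. (7.11); the value β = 0.5 and the function name are assumptions for illustration, and only the magnitude of the turn is compared, since its sign depends on the convention chosen for the angles.

```python
import math

def observed_angle(theta_prime, beta):
    """Angle θ in K for a ray that makes the angle θ' with the x' axis in K'."""
    cos_theta = (math.cos(theta_prime) + beta) / (1.0 + beta * math.cos(theta_prime))
    return math.acos(max(-1.0, min(1.0, cos_theta)))   # clamp against round-off

beta = 0.5
theta_prime = math.acos(-beta)       # chosen so that θ = 90° in the frame K
theta = observed_angle(theta_prime, beta)
turn = theta_prime - theta           # magnitude of the apparent turn
print(math.degrees(theta), math.degrees(turn), math.degrees(math.asin(beta)))
```

For observation at a right angle to the motion in K the magnitude of θ − θ′ equals arcsin β, in agreement with the cube example.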
From this it is clear that a sphere turns as well, but its
outline does not vary, of course (see [28] for details). 

And still, can an object be photographed so that a plate will
register a relativistic contraction? To avoid difficulties associated
with a turn, one can consider a one-dimensional object which is
easy to compare with its own image on the plane. For this
purpose the observer in the frame K must know beforehand that a
rod moves along a given direction. The rod is at rest in the
frame K′, and its proper length is also known. In this case the
observer constructs a counterpart of the moving rod in his
frame K and takes a photograph of the moving rod against the
background of its own length.

Fig. 3.4. The basic arrangement permitting the Lorentz contraction
of a moving rod to be photographed. When the middle of the rod O′
gets onto the line PO, a special device opens the shutter at P for
an instant to pass through the rays emitted by the rod’s points at
the moment when the point O′ crosses the line PO.

The simplest arrangement intended for photographing a rod
experiencing the Lorentz contraction could be of the type shown
in Fig. 3.4. The rod is parallel to the x axis and moves along it.
An observer is located on a perpendicular to the x axis, the
perpendicular going through the middle of the rod’s counterpart at
rest in the frame K. As soon as the middle of the moving luminous
rod finds itself on the perpendicular, a mechanism triggers
a camera’s shutter, and the simultaneous positions of the rod’s
ends get registered on a photographic plate. As to the stationary
counterpart of the rod, it can be photographed at any time, of
course. Most likely, this is also an “imaginary experiment”.

It should also be mentioned, however, that a “turn of a cube” 
moving at a relativistic velocity was qualitatively photographed. 
We refer the reader interested in details to [27]. 

In general, a “visible” picture may differ quite essentially for
different observers. Here is a simple example. Let an observer
be at some distance from a plane at which electric bulbs flash
simultaneously in the frame where the plane and the observer are
at rest, or where a light flash is produced in some other way. Then
due to the finiteness of the velocity of light the observer will see
that the plane starts glowing gradually with an “illumination wave”
running from the centre to the periphery. In particular, if an
infinite thread is illuminated instantaneously, the remote observer
will see two luminous points running apart.


Owing to the same reason a visible (observable) velocity may
also differ from a real one: it may prove to be even higher than
the velocity of light.
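
The example of the instantaneously illuminated thread can be made quantitative with elementary geometry. The sketch below is an illustration, not part of the original text: the observer is assumed to be at a distance d from the thread, r is measured along the thread from the foot of the perpendicular, and units with c = 1 are used.

```python
import math

def arrival_time(r, d, c=1.0):
    """Time at which light from the thread point at distance r from the foot
    of the perpendicular reaches an observer at distance d from the thread."""
    return math.hypot(r, d) / c

def apparent_speed(r, d, c=1.0):
    """Speed dr/dt with which the luminous point seen at r appears to run
    along the thread; it follows from differentiating arrival_time."""
    return c * math.hypot(r, d) / r

d = 1.0                                  # observer's distance from the thread
for r in (0.1, 1.0, 10.0):
    print(r, arrival_time(r, d), apparent_speed(r, d))
```

The apparent speed exceeds c everywhere and approaches c only far from the observer. No signal travels along the thread, so there is no conflict with the limiting character of the velocity of light.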

§ 3.3. Relativity of time intervals between events. Suppose that
two events occurred at some point of the frame K′ at time moments
t′₁ and t′₂. The time interval separating these events can
be registered by the clock located at that point. According to the
definition, the time interval between two events occurring at the
same point of a certain reference frame and registered by the
same clock of this frame is called a proper-time interval between
the events. Designating the proper-time interval by Δt⁰, we obtain
in our case

$$\Delta t^0 = \Delta t' = t'_2 - t'_1.$$


Let us now determine the time interval between the considered
events in the frame K. According to Eq. (3.2′) Δt = ΓΔt′ = ΓΔt⁰.
This result is already familiar to us (Eq. (2.2)).

However, if the events in the frame K′ occurred at one point
in space, this is not the case in any other frame K. Indeed, let
two events occur in the frame K′ at different moments but at one
point, i.e. Δx′ = 0 but Δt′ ≠ 0. According to Eq. (3.1′)
Δx = ΓVΔt′ ≠ 0. The meaning of the last result is obvious: all
points of the frame K′ move relative to K at a velocity V, and Δx
is just a displacement of any point of the frame K′ during the
interval considered. Since ΓΔt′ = ΓΔt⁰ = Δt, then Δx = VΔt.

Therefore, time intervals between the same pair of events turn
out to be different in different IFRs. We shall observe the least
time interval between events in the reference frame in which these
events occur at the same point and, consequently, are registered
by the same clock. In other words, the proper-time interval is the
least.

A time interval between events in terms of the frame K can
also be measured as follows. The point of the frame K′ at which
the two considered events occur, moves at the velocity V relative
to the frame K. If the events occurred in the frame K at the
points x₁ and x₂, the time interval Δt between the events is,
obviously, equal to Δt = Δx/V. But according to Eq. (3.1′)
Δx = ΓVΔt′ (Δx′ = 0), whence we obtain again


$$\Delta t = \Gamma\,\Delta t' = \frac{\Delta t^0}{\sqrt{1 - \beta^2}}. \quad (3.9)$$

The direct experimental confirmation of the conclusion of the
STR about the relativity of time intervals is widely known. Light
elementary particles (muons) were discovered, on the one hand,
in a laboratory as a result of nuclei splitting and, on the other
hand, in cosmic rays. The lifetime (the half-life) of muons
measured in laboratory conditions proves to be equal to about
2·10⁻⁶ s. This lifetime can be regarded as a proper lifetime, since
the velocities of laboratory muons are non-relativistic (see Eq.
(3.9); the velocity of the coordinate system K′ is the velocity of
the frame fixed to a muon). In the interval Δt⁰ = 2·10⁻⁶ s a muon
breaks up into other particles.

It is known that the muons observed in cosmic rays at the
surface of the Earth originate in the upper layers of the atmosphere
at the height of from five to six kilometres due to the primary
cosmic radiation. The velocity of the generated muons moving
toward the Earth is comparable with that of light. According to
Eq. (3.9) the half-life Δt of a muon in the laboratory frame is
equal to Δt = ΓΔt⁰. In the case of muons Γ ≈ 10 and in the
laboratory frame of reference Δt = 2·10⁻⁵ s. During this time a
muon travels the distance cΔt = 3·10¹⁰ cm/s × 2·10⁻⁵ s = 6 km.
But for the relativity of time intervals, muons would have travelled
only about 600 m and we would not have observed them at the
sea level. Thus, only the relativistic transformation of time
intervals makes it possible to explain muon showers observed on
the Earth.
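
The muon estimate can be reproduced in a few lines. The sketch below uses the text’s rounded figures (c = 3·10⁸ m/s, a proper half-life of 2·10⁻⁶ s, Γ ≈ 10, a height of 6 km) and, as in the text, takes the muon velocity to be practically c; the surviving fraction is an added illustration that follows from treating 2·10⁻⁶ s as a half-life.

```python
C = 3.0e8                  # speed of light in m/s, rounded as in the text
HALF_LIFE_PROPER = 2.0e-6  # proper half-life of the muon in s, the text's value

def path_per_half_life(gamma):
    """Distance covered at v close to c during one half-life dilated by Eq. (3.9)."""
    return C * gamma * HALF_LIFE_PROPER

def surviving_fraction(height, gamma):
    """Fraction of muons left after descending `height` metres."""
    return 0.5 ** (height / path_per_half_life(gamma))

for gamma in (1.0, 10.0):  # gamma = 1 switches the time dilation off
    print(gamma, path_per_half_life(gamma), surviving_fraction(6000.0, gamma))
```

Without time dilation about one muon in a thousand would survive the descent; with Γ ≈ 10 roughly half of them reach sea level.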

An excellent illustration of the relativity of time intervals is
provided by the Doppler effect. This effect consists in the fact
that if a light source and an observer (a receiver) move relative
to each other *, the light frequency determined by the observer
differs from the frequency that would be observed by him if he
were stationary relative to the source. It is natural to refer to a
light frequency determined by the observer stationary relative to
the source as a proper frequency of light. Let us designate it by
ω₀. All our conclusions pertain to vacuum.


* Note that in our subsequent discussions of “classical” physics we shall
always utilize a typically relativistic assumption that no material medium is
needed for light propagation. That is why only the relative velocity of a light
source and an observer is essential for us.


Let us first consider the case when the direction of light
propagation coincides with the direction of the relative velocity of
the source and the observer, the so-called radial Doppler effect.
In Fig. 3.5a this corresponds to light propagation along the x, x′
axis. To consider the Doppler effect, one can imitate a light wave
by sending short pulses from the source at the interval (period) T.
Let us suppose now that such pulses are sent from the origin of
the frame K, i.e. from the point O. Any observer at rest in this
frame will discover that these pulses come to him at the same
intervals T. Now let us fix the observer to the frame K′. Let the
first pulse be sent from the point O at the moment when the
origins O and O′ coincide (t = 0, t′ = 0). Naturally, the same
pulse will be registered by the observer at the point O′. The next
pulse leaves the point O after the time interval T (by the clock
of the frame K). But at that moment the origin O′ is already at
the distance VT from O. The velocity of light relative to O′ in the
frame K is equal to c − V, so that a light ray will need the
additional time VT/(c − V) to reach the origin O′. Therefore, the
observer at O′ will receive the pulse after the time interval


$$T' = T + \frac{VT}{c - V} = \left(1 + \frac{V/c}{1 - V/c}\right) T = \frac{T}{1 - \beta} \quad (3.10)$$

when registered by the clock of the frame K. We have obtained
the time interval between the first and the second signals received
by the observer at O′ and registered by the clock of the frame K.
The reception of the signal in the frame K′ takes place at one
point O′ and the transition to the proper-time interval can be
accomplished in accordance with Eq. (3.9): T′₀ = (1/Γ)T′, so that
the period T′₀ will be equal for the observer at O′ to


$$T'_0 = \frac{\sqrt{1 - \beta^2}}{1 - \beta}\,T = \sqrt{\frac{1 + \beta}{1 - \beta}}\;T. \quad (3.11)$$


Passing over to frequencies (ω₀ = 2π/T, ω′ = 2π/T′₀), we obtain


$$\omega' = \sqrt{\frac{1 - \beta}{1 + \beta}}\;\omega_0 \approx (1 - \beta)\,\omega_0. \quad (3.12)$$

Fig. 3.5. The derivation of equations of the radial Doppler effect: (a) an
observer moves away from a source; (b) an observer approaches a source.
The approximate equality in Eq. (3.12) is most easily obtained as
follows: the numerator and denominator under the radical sign are
multiplied by (1 − β). The right-hand side of Eq. (3.12) is
immediately obtained when the term β² is ignored in the denominator
expression (1 − β²). This equation is valid for an observer moving
away from the source: the observed frequency is less than the
proper one. When an observer approaches the source (in Fig. 3.5b the
point O″ is to the left of O) and the signals are sent from the
frame K″, the analogous reasoning (the source and the observer
converge and the velocity of light relative to O″ in the frame K is
c + V) brings about the equations


$$T'' = \frac{T}{1 + \beta}, \qquad T''_0 = \sqrt{\frac{1 - \beta}{1 + \beta}}\;T,$$

and finally

$$\omega'' = \sqrt{\frac{1 + \beta}{1 - \beta}}\;\omega_0 \approx (1 + \beta)\,\omega_0. \quad (3.13)$$
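
The two-step derivation of the radial Doppler effect lends itself to a direct numerical check. In the sketch below the function names and the sample values of β are illustrative assumptions; the steps themselves follow Eqs. (3.10) to (3.13).

```python
import math

def period_receding(T, beta):
    """Follow the text: Eq. (3.10) in K's clock, then Eq. (3.11) to proper time at O'."""
    t_in_k = T / (1.0 - beta)                    # Eq. (3.10)
    return t_in_k * math.sqrt(1.0 - beta**2)     # Eq. (3.11)

def omega_receding(omega0, beta):
    return omega0 * math.sqrt((1.0 - beta) / (1.0 + beta))    # Eq. (3.12)

def omega_approaching(omega0, beta):
    return omega0 * math.sqrt((1.0 + beta) / (1.0 - beta))    # Eq. (3.13)

omega0 = 2.0 * math.pi     # proper frequency for T = 1 in arbitrary units
for beta in (1e-4, 0.1, 0.6):
    exact = omega_receding(omega0, beta)
    classical = omega0 * (1.0 - beta)            # approximate form in Eq. (3.12)
    print(beta, exact / omega0, classical / omega0, (exact - classical) / omega0)
```

The last printed column is of the order of β²/2: negligible at β = 10⁻⁴ and appreciable at β = 0.6, which shows where the exact and approximate forms part company.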


Eqs. (3.12) and (3.13) (without approximations) are the exact
relativistic equations describing the radial Doppler effect. They
will be obtained later (§ 7.2) on the basis of strict relativistic
equations. But the derivation cited here is faultless in
terms of physics. At the same time it clearly shows that the
Doppler effect is formed of two independent parts: (1) it is
connected with a continuously changing distance between the observer
and the source; (2) it is also connected with the transformation
of time intervals between events on transition from one reference
frame to another. The first factor does not pertain to the theory
of relativity in the slightest degree. The radial Doppler effect
follows qualitatively from the classical theory, with the
corresponding equation being obtained from Eq. (3.10). There is
nothing to change in Eq. (3.10) in terms of the classical theory,
since time intervals in all reference frames are the same. The
difference between the classical equation and the relativistic one
is essential only to the order of magnitude of β². The last
approximate equation in (3.12) just gives the classical expression
obtained from Eq. (3.10). Inasmuch as the ratio β is determined by
the relative velocity of the source and the observer, it is very
small at least for macroscopic sources, and the Doppler effect is
defined primarily by a variation of the distance between a source
and an observer. However, there is a case of a zero relative radial
velocity of a source and an observer, although the frames, in which
the source and the observer are at rest, move relative to each
other. This happens when the moving source is observed at the
moment when its velocity is perpendicular to the observation
direction (the line of vision) (Fig. 3.6a). At the moment of
observation illustrated in Fig. 3.6a the distance between the source
and the observer does not vary. Consequently, no Doppler effect is
possible from a classical viewpoint. But in terms of the relativistic
theory the period T₀ between signals in the frame K′ is the
proper-time interval and ω₀ = 2π/T₀. Having converted T₀ into the
observer’s time according to Eq. (3.9), i.e. T = T₀/√(1 − β²), we
obtain the equation of the transverse Doppler effect:

$$\omega = \sqrt{1 - \beta^2}\;\omega_0. \quad (3.14)$$


This is the equation of the second order with respect to B. The 
transverse effect is more difficult to observe than the radial one, 








Fig. 3.6. The derivation of the Doppler effect equations: (a) the transverse
effect; (b) the general case.


but still it was observed in 1938. Its discovery, as it is seen from 
the foregoing reasoning, is a direct evidence of the relativity of 
time intervals between events. It should be emphasized once more
that the very existence of the transverse Doppler effect follows
only from the STR. The classical equation of the Doppler effect can easily
be derived if the radiation at the angle θ to the direction of the
source motion is considered (Fig. 3.6b). If the first pulse is
emitted at the point A and the second at the point B, the path
difference of parallel rays travelling at the angle θ to the velocity
direction is equal to VT cos θ. It is clear from this that

$$T' = T - \frac{VT\cos\theta}{c}$$

in terms of the observer, and, consequently,

$$\omega' = \frac{\omega_0}{1 - \dfrac{V}{c}\cos\theta}.$$

Note that ω′ and ω₀ in this equation are measured
in the same reference frame. Thus we make sure once
again that there is no transverse Doppler effect (θ = π/2) in
classical physics: ω′ = ω₀.
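
The contrast between the two results can be checked numerically. The following is only a minimal sketch: the function names and the sample values of B are illustrative assumptions, not taken from the text. It evaluates the classical expression ω′ = ω₀/(1 − B cos θ) at θ = π/2 and the relativistic transverse result of Eq. (3.14).

```python
import math


def classical_doppler(omega0, B, theta):
    """Classical result from the text: omega' = omega0 / (1 - B*cos(theta)), B = V/c."""
    return omega0 / (1.0 - B * math.cos(theta))


def transverse_doppler(omega0, B):
    """Relativistic transverse effect, Eq. (3.14): omega = omega0 * sqrt(1 - B**2)."""
    return omega0 * math.sqrt(1.0 - B * B)


omega0 = 1.0  # emitted frequency in arbitrary units (illustrative value)
for B in (1e-4, 0.1, 0.6):  # sample values of V/c (illustrative)
    classical = classical_doppler(omega0, B, math.pi / 2)
    relativistic = transverse_doppler(omega0, B)
    print(f"B = {B:g}: classical {classical:.10f}, relativistic {relativistic:.10f}")
```

For small B the two values differ only in the second order in B, which is why the transverse effect is hard to observe.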

We have made sure that when two events in a certain reference 
frame occur at the same point, i.e. are single-positioned, the time
interval between them is defined as the proper-time interval, i.e.
is measured by one clock, whereas the time interval between the 
same events in any other frame can be calculated according to 
Eq. (3.9). The following question arises: is it always possible to 
convert a time interval found in an arbitrary reference frame into 
a proper-time interval? It turns out that this cannot be always 
done, and the stipulation, under which this becomes possible, will 
be found in § 3.4. 

Now let us introduce the concept of the object’s proper time. 
Let an object move uniformly and rectilinearly relative to the 
frame K. The frame K’ can be fixed to the moving object. The 
object is at rest in this frame, so that events happening with this 
object or at it are registered by one clock. This clock counts the 
proper time at the point where the object is located; it can be 
said that this clock counts the object’s proper time. Eq. (3.9) 
shows in this case that the interval between events that happened 
with the object or at it, is always less in terms of the object’s 
proper time than the time interval between the same events reg- 
istered by the clocks of any IFR relative to which this object 
moves. It should not be forgotten here that the proper-time in- 
terval is registered by one clock, while the time interval in the 
frame relative to which the object moves is registered by at least 
two clocks. This is very important because in interpreting Eq.
(3.9) a moving clock is often said to have a slower rate than a
stationary one. Such a mode of expression, however, may only
confuse the situation. In fact, the clock rates are the same in all
reference frames. What turns out to be different is the readings of time
intervals between events. But it is only natural, because the clocks
synchronized in one IFR are dis-synchronized in another.

The proper time can also be introduced for a particle moving
with acceleration. To do this, let us consider the motion of a particle
during an infinitesimal time interval. Let the particle velocity
at a given moment be equal to V. Consider now the inertial frame
of reference K* moving at the velocity V. In this reference frame
the equation $d\tau = \sqrt{1 - B^2}\,dt$ (with B = V/c) is valid. This equation is also
approximately valid for the frame K’ instantaneously co-moving with
the particle. The frame K* differs from the instantaneously co-moving
frame K’ in that the latter, K’, moves with acceleration, while
the former, K*, does not, although both of them move at the same
velocity at a given moment. The smaller the time interval dt is, the
more accurately the equation $d\tau = \sqrt{1 - B^2}\,dt$ applies in the
frame K’. Having integrated this equation, we obtain the precise
expression in terms of τ, which is in essence the overall “proper”
time of the many coordinate systems K*.

Proceeding from these considerations we shall suppose that if 
the interval between events that happened to the object turns out
to be equal to dτ when measured by the clock fixed to the reference
frame co-moving with the object, the time interval dt between
the same events, measured by the clocks of another IFR relative
to which the object moves, will be, according to Eq. (3.9),

$$dt = \gamma\,d\tau,$$

where

$$\gamma(t) = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta = \frac{v(t)}{c}; \tag{3.15}$$

now v = v(t) is the velocity of the object (and not of the reference
frame). Owing to this circumstance the designations β
and γ have been introduced. When the velocity of the object varies
according to the equation v = v(t), the relationship between the
finite time interval τ − τ₀ and the time interval registered by the clocks
of the frame relative to which the object moves is obtained by integration:

$$\tau - \tau_0 = \int_{t_0}^{t} \sqrt{1 - \left(\frac{v(t)}{c}\right)^2}\,dt. \tag{3.16}$$


What is the quantity that is seen on the left-hand side of Eq. 
(3.16)? Of course, it can be called the object’s proper time. But 
how to measure it? Strictly speaking, Eq. (3.9) is valid for the 
clocks in inertial frames of reference. But if the clock is fixed 
to an arbitrarily moving object, it will undergo an acceleration. 
No doubt, an acceleration affects the clock rate in a varying 
degree depending on the design of the clock. (If you do not believe 
in this, drop your clock on the floor.) Consequently, one can hardly 
speak of time readings made by means of such a clock. The reasonable
interpretation of Eq. (3.16) is that τ − τ₀ is
the overall time measured in many inertial frames co-moving
with the object or, which is the same, the time registered by a
clock fixed rigidly to the object and not affected at all by the
acceleration of the object.
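
The integral in Eq. (3.16) is easy to evaluate numerically for a given law of motion v(t). The sketch below uses a simple midpoint rule; the function name, the step count and the sample motion are illustrative assumptions, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuo, m/s


def proper_time(v_of_t, t0, t1, steps=100_000):
    """Midpoint-rule estimate of Eq. (3.16): integral of sqrt(1 - (v(t)/c)**2) dt."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t_mid = t0 + (k + 0.5) * dt      # midpoint of the k-th subinterval
        beta = v_of_t(t_mid) / C         # beta = v(t)/c of the object
        total += math.sqrt(1.0 - beta * beta) * dt
    return total


# Illustrative case: an object moving uniformly at 0.6c for 10 s of frame time.
print(proper_time(lambda t: 0.6 * C, 0.0, 10.0))
```

For uniform motion the result reduces to Eq. (3.9): the proper-time interval is the frame-time interval multiplied by √(1 − β²).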

It should be stressed that the difference in readings of clocks
from different inertial frames of reference which we obtained has
no relation whatever to any irregularity of the clock rate in one
or another frame. As in the case of a measurement of a ruler’s
length, we deal here with different methods of time measurement.
The rates of all clocks in all reference frames are absolutely the
same. The measurements of time intervals between two events
performed by the two sets of clocks from different reference
frames, synchronized within their respective frames, lead to the
result obtained: a proper-time interval between two events turns
out to be always the least.

Let us consider an example, which is quite analogous to that
discussed in connection with the variation of the scale’s length,
showing that the slowing down of time is caused by different methods
of its comparison. Let us take two quite identical clocks: A in the
frame K and A’ in the frame K’. These can be atoms of the same
kind. Suppose we observe the clock A from the frame K’, i.e. compare
the clock A with the set of clocks synchronized with the
clock A’. Then observers from K’ will discover that the clock A
goes slower than the set of clocks from K’. On the contrary, observers
from the frame K viewing one clock A’ of the frame K’
will discover that it goes slower than the set of clocks from K.
Are these results contradictory? No. We clearly see that the methods
of clock comparison in the first and the second cases are
different. The clock which is compared with different clocks from
another reference frame always goes slower. This amazing situation
proves to be inevitable. The equivalence of all IFRs underlies the
theory of relativity, so when relative values emerge, they emerge
in the same manner in all IFRs.

§ 3.4. The classification of intervals and the principle of causality.
It is seen from Eqs. (3.1) and (3.2) that when two arbitrary
events are considered, both the distance and the time interval
between them prove to be relative values (Δx′ ≠ Δx, Δt′ ≠ Δt).
Until now we dealt with events of a special type: in length
measurements the coordinates of a ruler’s ends $x_1$ and $x_2$ were
considered simultaneously ($t_1 = t_2$); when a time interval was
determined, the moments of time $t_1$ and $t_2$ were considered at the
same point $x_1 = x_2$. But even in those cases the spatial and temporal
“distances” between events turned out to be relative. No
wonder they do also in a general case. In addition to that, it
follows directly from the Einstein postulates that the interval
between events is, as we know, the invariant of the Lorentz transformation:

$$s_{12} = \sqrt{c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2} = \sqrt{c^2 (\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2}. \tag{3.17}$$

The designations are the same as were used in the derivation of
Eqs. (3.1) and (3.2). It is convenient to introduce also the special
designations for the spatial and temporal distances between
events:

$$l_{12}^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2, \qquad \Delta t = t_{12}. \tag{3.18}$$

Having written out the squared interval between two events
in the frame K as $s_{12}^2 = c^2 t_{12}^2 - l_{12}^2$ and in the frame K’ as
$s_{12}'^2 = c^2 t_{12}'^2 - l_{12}'^2$, we obtain the condition for the interval invariance
$s_{12}^2 = s_{12}'^2$ as

$$c^2 t_{12}^2 - l_{12}^2 = c^2 t_{12}'^2 - l_{12}'^2. \tag{3.19}$$
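
The invariance of the interval can be verified numerically for motion along the x axis. The sketch below is an illustration only: the function names, the units with c = 1 and the sample events are assumptions. It applies the transformation Δx′ = Γ(Δx − VΔt), Δt′ = Γ(Δt − VΔx/c²) to two events and compares the squared intervals.

```python
import math


def lorentz_transform(x, t, V, c=1.0):
    """Coordinates of the event (x, t) in the frame K' moving at V along x relative to K."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return gamma * (x - V * t), gamma * (t - V * x / c ** 2)


def interval_squared(event_1, event_2, c=1.0):
    """Squared interval c^2 t12^2 - l12^2 for events given as (x, t) pairs."""
    (x1, t1), (x2, t2) = event_1, event_2
    return c ** 2 * (t2 - t1) ** 2 - (x2 - x1) ** 2


event_1 = (0.0, 0.0)  # illustrative events in units with c = 1
event_2 = (3.0, 5.0)
for V in (0.0, 0.3, 0.6, -0.9):
    e1 = lorentz_transform(*event_1, V)
    e2 = lorentz_transform(*event_2, V)
    print(f"V = {V:+.1f}: s^2 = {interval_squared(e1, e2):.12f}")
```

The coordinates of the events change from frame to frame, while the squared interval stays the same within rounding error.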


Considering the events in an arbitrary reference frame K, we 
shall most likely discover that they happened at different points 
in space and at different moments of time. 

Is it possible through the choice of the reference frame K’ to 
ensure that (a) events I and II happen at the same point in space,
i.e. be single-positioned; (b) events I and II happen at the same 
moment of time; and, finally, (c) events I and II happen at one 
point in space and at the same moment of time? Let us begin 
from the beginning. 

(a) Is it possible to choose such a system K’, in which these 
events will happen at the same point in space, i.e. will be single-positioned?
This means that the following condition should be
met: $l'_{12} = 0$. But then it follows from Eq. (3.19) that

$$s_{12}^2 = c^2 t_{12}^2 - l_{12}^2 = c^2 t_{12}'^2 > 0, \tag{3.20}$$

i.e. $s_{12}^2 > 0$, and the interval $s_{12}$ must be real. In the frame K’
the events considered happen at one point in space and the time
interval between them is equal (with an accuracy to within the
factor c) to

$$t'_{12} = \frac{1}{c}\sqrt{c^2 t_{12}^2 - l_{12}^2} = \frac{s_{12}}{c}. \tag{3.21}$$

That is why the real intervals between events are referred to as
time-like intervals. The condition for a time-like interval can also
be written in the form $l_{12} < c\,t_{12}$.

Let us consider the motion of a particle possessing a rest mass. 
Conventional mechanics deals with objects of only this kind. 
Suppose for the sake of simplicity that this particle moves uniformly
along the x axis, covering the distance Δx in the time Δt.
In the frame K’ this particle will travel the distance Δx′ in the
time Δt′ which is determined in accordance with Eqs. (3.1) and
(3.2). The ratio Δx/Δt = v is the velocity of the particle in the
frame K. Taking this into account, we can rewrite Eqs. (3.1) and
(3.2):

$$\Delta x' = \Gamma(\Delta x - V\Delta t) = \Gamma\left(\frac{\Delta x}{\Delta t} - V\right)\Delta t = \Gamma(v - V)\,\Delta t, \tag{3.22}$$

$$\Delta t' = \Gamma\left(\Delta t - \frac{V}{c^2}\,\Delta x\right) = \Gamma\left(1 - \frac{V}{c}\,\frac{\Delta x}{c\,\Delta t}\right)\Delta t = \Gamma\left(1 - \frac{Vv}{c^2}\right)\Delta t. \tag{3.23}$$

Assuming Δx′ = 0 in Eq. (3.22), one can easily find the velocity
of the frame K’ in which the two events in question are single-positioned.
From the right-hand side of the equation we immediately
obtain the evident answer V = v. So, this is just the frame
co-moving with the particle. Another important consequence follows
from Eq. (3.23). Let $\Delta t = t_2 - t_1 > 0$. This implies that
event II happened after event I. Is there such a frame K’ in which
Δt′ < 0, i.e. the time sequence of the events is reversed as compared
to the frame K? Eq. (3.23) shows the sign of Δt′ to coincide
with that of Δt when $1 - Vv/c^2 > 0$. But this condition is always
satisfied, since the velocity of an object v is always less than c.
(The reference frame is also a material body.) The same condition
is also satisfied for any pair of events coupled by a time-like
interval. Indeed, the third link of Eq. (3.23) contains the expression
$\left(1 - \frac{V}{c}\,\frac{\Delta x}{c\,\Delta t}\right)$. According to Eq. (3.18) $l_{12} \geq \Delta x$, and if
$c\,t_{12} > l_{12}$, then cΔt > Δx a fortiori. This implies that the ratio Δx/cΔt
is less than unity; V/c is always less than unity, so that
$\left(1 - \frac{V}{c}\,\frac{\Delta x}{c\,\Delta t}\right) > 0$ and, consequently, Δt′ > 0.

Hence, for these two events, considered in terms of the frames K
and K’, the concepts “later” and “earlier” have an identical, that
is absolute, character. In general, if the interval between events
is time-like (recall that the interval is the invariant quantity), the
time sequence of events remains the same throughout all IFRs.
Later on we shall see that this is not the case for intervals which
differ from time-like intervals by sign.

What is the significance of the invariance of time sequence in 
all inertial frames of reference? We have already ascertained that 
two events separated by a time-like interval will be single-posi- 
tioned in some reference frame. If one of them happened “earlier” 
and the second “later”, the first event may be the cause for the 
origination of the second, i.e. they may be connected by the cause- 
and-effect relationship. But in this case their time sequence cannot 
depend on the choice of a reference frame. It is from our results
that the criterion for the possibility of the cause-and-effect rela- 
tionship follows (the interval is time-like). As to the time sequence 
of events, it automatically remains the same in all reference 
frames. 

The time-like interval between events indicates the possibility 
of a cause-and-effect relationship between events not only because 
it provides the identical time sequence in all IFRs. It indicates 
the physical opportunity of one event affecting another. It follows 
from the inequality $l_{12}^2 < c^2 t_{12}^2$, determining a time-like interval,
that during the time passing between the two events light can
a fortiori cover the distance from the point where event I occurred
to the point where event II did, the product $c(t_2 - t_1)$ being the
path travelled by light during the time $(t_2 - t_1)$. This means that
in principle a certain interaction (signal) could propagate from the
point where event I occurred to the point where event II did
during the time interval between the events. Without claiming
generality in formulating the problem, we shall assume that one
event can affect another only through a physical (force) interaction.
Then, if event I happened, a “signal” about this fact can
reach the point where event II will happen prior to the moment
of occurrence of event II. This means that event I can be the cause
of event II, and event II can be the effect of event I. In this case
the events can have the cause-and-effect relationship. Thus, the
events separated by a time-like interval can have the cause-and-effect
relationship in terms of physics as well. It is understood
that they may not be in such a relation. We only point to the
theoretical possibility. What is essential, the time sequence cannot
be upset in the case of such intervals: the consequence can
never affect its cause.

(b) Now let us pass over to the consideration of intervals of 
the opposite sign. Let us examine again the condition of the in- 
terval invariance (Eq. (3.19)) and determine whether we can find 
such a coordinate system K’ in which the two given events I and
II happen simultaneously. This means that in this system $t'_{12} = 0$.
Hence $s_{12}^2 = -l_{12}'^2 < 0$. The squared interval between the events
must be negative, and the interval proves to be imaginary. In the
frame K’ the events in question happen at the same moment of
time, and the interval between them is reduced (with an accuracy
to within the number i) to the spatial interval: $l'_{12} = i s_{12}$. That is
why imaginary intervals are referred to as space-like intervals.
The condition for a space-like interval can also be written in the
form $l_{12} > c\,t_{12}$.

Can one find the reference frame in which Δt′ = 0 for two given
events? Assuming Δt′ = 0, we obtain from Eq. (3.2):

$$\frac{V}{c} = \frac{c\,\Delta t}{\Delta x}. \tag{3.24}$$

Since one can always choose the events so that $l_{12} = \Delta x$, it follows
from the condition for a space-like interval that in this case
Δx > cΔt. Eq. (3.24) testifies that we can get V < c, that is,
basically, such a frame can be chosen. The ratio Δx/cΔt appears
in the third link of Eq. (3.23); as we have mentioned, it can exceed
unity. But this means that the factor $\left(1 - \frac{V}{c}\,\frac{\Delta x}{c\,\Delta t}\right)$ can be
made negative by the appropriate choice of V.

It follows from here that the time sequence of two events
related by a space-like interval can be reversed on transition from
one IFR to another. Such a reversal would be inadmissible for events which
could have the cause-and-effect relationship. But in terms of
physics they just cannot be so related. Indeed, the condition
$l_{12}^2 > c^2 t_{12}^2$ signifies that no “signal” can be transmitted from the
point where event I happens to the point where event II does
during the time interval between these events. Consequently,
events separated by a space-like interval cannot be in a cause-and-effect
relation.

Thus, the special theory of relativity makes it possible to indicate
the conditions under which the cause-and-effect relationship
becomes either possible or impossible. This is a very important
criterion which cannot be obtained in the general form from other
premises. It should be emphasized once more, of course, that all
of our reasonings are based on the premise of the finite velocity
of signal transmission.
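
The reversal of the time sequence for a space-like pair of events can be illustrated numerically. In the sketch below the function names, the units with c = 1 and the sample values Δx = 5, Δt = 3 are assumptions made for illustration. The frame velocity at which the events become simultaneous follows from setting Δt′ = 0 in Δt′ = Γ(Δt − VΔx/c²), which gives V = c²Δt/Δx.

```python
import math


def delta_t_prime(dx, dt, V, c=1.0):
    """Time separation in K': dt' = Gamma * (dt - V*dx/c**2)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return gamma * (dt - V * dx / c ** 2)


def simultaneity_velocity(dx, dt, c=1.0):
    """Velocity of the frame in which dt' = 0; it exists only for a space-like pair."""
    V = c * c * dt / dx
    if abs(V) >= c:
        raise ValueError("the events are not separated by a space-like interval")
    return V


dx, dt = 5.0, 3.0  # space-like pair: dx > c*dt in units with c = 1 (illustrative)
V0 = simultaneity_velocity(dx, dt)
print("simultaneous at V =", V0)
for V in (0.3, V0, 0.8):
    print(f"V = {V:.1f}: dt' = {delta_t_prime(dx, dt, V):+.4f}")
```

For V below c²Δt/Δx the order of the events is the same as in K, and for larger V (still less than c) it is reversed.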

(c) If we are interested in the reference frame in which the 
events would be both simultaneous and single-positioned, i.e. the
two conditions $t'_{12} = 0$ and $l'_{12} = 0$ would be satisfied, the two inequalities
$s_{12}^2 \geq 0$ and $s_{12}^2 \leq 0$ would have to be simultaneously
complied with. This is possible only when $s_{12} = 0$ and $s'_{12} = 0$. If
the events in question do not represent the sending and reception 
of light signals, the intervals can be equal to zero only in the 
case when the two events coincide in each of the frames K and K’. 
Of course, the coincidence of events does not depend on the choice 
of a reference frame. 

The interval between two events that happened in a given frame 
at different points in space and at different moments of time and 
whose absolute value is equal to zero, is appropriately referred to as
a light-like interval. A light-like interval links together events
consisting in a light wave traversing consecutively various points
in space. We made sure of this at the beginning of § 2.6.
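
The classification of this section can be summarized in a short routine. It is only a sketch: the function name and the absolute tolerance `tol`, used to decide when the squared interval counts as zero in floating-point arithmetic, are assumptions.

```python
def classify_interval(dt, dx, dy=0.0, dz=0.0, c=1.0, tol=1e-12):
    """Classify a pair of events by the sign of s^2 = c^2 dt^2 - (dx^2 + dy^2 + dz^2)."""
    s2 = c * c * dt * dt - (dx * dx + dy * dy + dz * dz)
    if abs(s2) <= tol:   # tol is an assumed absolute tolerance for rounding errors
        return "light-like"
    return "time-like" if s2 > 0 else "space-like"


print(classify_interval(dt=5.0, dx=3.0))  # a causal connection is possible
print(classify_interval(dt=3.0, dx=5.0))  # no causal connection is possible
print(classify_interval(dt=4.0, dx=4.0))  # events linked by a light signal
```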

§ 3.5. The transformation of velocity components of a particle 
on transition from one inertial frame of reference to another. From 
the Galilean transformation for coordinates and time (Eq. (1.2)) 
we obtained Eq. (1.4) showing how the particle velocity is trans- 
formed on transition from one IFR to another: v’ = v — V. This 
transformation rule does not satisfy the second postulate of Ein- 
stein, since the velocity of light in vacuo turns out to be different 
in different reference frames. The velocity transformation equations
following from the Lorentz transformation satisfy the Einstein
postulates. Now we proceed to the derivation of these equations.

Let us consider the particle motion in terms of the two IFRs:
K and K’. The velocities are determined as usual:

In the frame K: if x = x(t), y = y(t), z = z(t), then

$$v_x = \frac{dx}{dt}, \qquad v_y = \frac{dy}{dt}, \qquad v_z = \frac{dz}{dt}.$$

In the frame K’: if x′ = x′(t′), y′ = y′(t′), z′ = z′(t′), then

$$v'_x = \frac{dx'}{dt'}, \qquad v'_y = \frac{dy'}{dt'}, \qquad v'_z = \frac{dz'}{dt'}.$$

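The relationship between x, t and x′, t′ is meant to be established
via the Lorentz transformation, so that t, for example, can
be adopted as an independent variable. When t varies by dt, all
variables get increments; differentiating the Lorentz transformation
(see Eq. (2.16)), we obtain these increments in terms of
differentials:

$$dx = \Gamma(dx' + V\,dt'), \qquad dy = dy', \qquad dz = dz', \qquad dt = \Gamma\left(dt' + \frac{V}{c^2}\,dx'\right). \tag{3.25}$$

Having divided termwise the first three equations of Eq. (3.25)
by the last one, we get

$$\frac{dx}{dt} = \frac{dx' + V\,dt'}{dt' + \dfrac{V}{c^2}\,dx'}, \qquad \frac{dy}{dt} = \frac{dy'}{\Gamma\left(dt' + \dfrac{V}{c^2}\,dx'\right)}, \qquad \frac{dz}{dt} = \frac{dz'}{\Gamma\left(dt' + \dfrac{V}{c^2}\,dx'\right)}.$$

Dividing the numerator and denominator of the right-hand sides
of these equations by dt′, we finally obtain

$$v_x = \frac{v'_x + V}{1 + \dfrac{V v'_x}{c^2}}, \qquad v_y = \frac{v'_y \sqrt{1 - V^2/c^2}}{1 + \dfrac{V v'_x}{c^2}}, \qquad v_z = \frac{v'_z \sqrt{1 - V^2/c^2}}{1 + \dfrac{V v'_x}{c^2}}. \tag{3.26}$$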

These are the formulae of the relativistic transformation of velocities.
Using them, the components $v_x, v_y, v_z$ in the frame K can
be found from the velocity components $v'_x, v'_y, v'_z$ in the frame K’
and the velocity of the frame K relative to K’ which is equal
to −V. In order to get the formulae for the inverse transformation,
it is necessary to change the sign of the velocity V, to prime
the unprimed quantities and to remove primes from the primed
ones:

$$v'_x = \frac{v_x - V}{1 - B\,\dfrac{v_x}{c}}, \qquad v'_y = \frac{v_y \sqrt{1 - B^2}}{1 - B\,\dfrac{v_x}{c}}, \qquad v'_z = \frac{v_z \sqrt{1 - B^2}}{1 - B\,\dfrac{v_x}{c}}. \tag{3.27}$$


Of course, the same result will be obtained from Eq. (3.26) 
directly. It is seen from Eqs. (3.26) and (3.27) that a uniform
motion in one IFR will be uniform in all other IFRs. Hence, the
uniform and rectilinear motion is distinguished from all other
kinds of motion. On the contrary, according to relativistic kinematics
the uniformly accelerated motion in a certain IFR may
not be that in other IFRs (see § 5.1).

In Eqs. (3.26) and (3.27) the x axis is distinguished from the 
y and z axes. This is only because the relative velocity of the 
reference frames K and K’ is directed along the x axis. Passing 
to the limit B → 0, i.e. assuming formally c → ∞ in these equations,
we get back to the Galilean transformation (Eq. (1.4)).
This means that the Galilean transformation is sufficiently accurate
if the relative velocity of the reference frames is small
compared to that of light. And hence we do not resort to relativistic
concepts in our everyday life. Here we have learned again
that the difference between relativistic and classical concepts is
due to the finiteness of the velocity of light. Note that Eqs. (3.26)
and (3.27) are easily derived using a four-dimensional approach
to the theory of relativity.

Let us consider the motion along the x’ axis. In this case the 
velocity components in the reference frame K’ will be $v'_x = v'$,
$v'_y = 0$, $v'_z = 0$. From Eq. (3.26) we see that in the frame K the
components $v_y$ and $v_z$ are equal to zero. Consequently, the motion
in the frame K also takes place along the x axis and $v_x = v$.
That is why according to Eq. (3.26)

$$v = \frac{v' + V}{1 + \dfrac{v'V}{c^2}}. \tag{3.28}$$


Setting v’ = c, we get v = c from Eq. (3.28). This corresponds 
to the second postulate of Einstein: the velocity of light in vacuo 
is the same in all IFRs. 
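
The velocity transformation can be tried out numerically. The sketch below implements the component formulae of Eq. (3.26) in units with c = 1; the function name and the sample values are illustrative assumptions.

```python
import math


def transform_velocity(vx_p, vy_p, vz_p, V, c=1.0):
    """Velocity components in K from the components in K' (K' moves at V along x)."""
    denom = 1.0 + V * vx_p / c ** 2
    root = math.sqrt(1.0 - (V / c) ** 2)
    return (vx_p + V) / denom, vy_p * root / denom, vz_p * root / denom


# Motion along x': the one-dimensional rule of Eq. (3.28).
print(transform_velocity(0.5, 0.0, 0.0, V=0.5))   # 0.5c "plus" 0.5c gives 0.8c
print(transform_velocity(1.0, 0.0, 0.0, V=0.5))   # light keeps the velocity c

# Light emitted along y' in K': the direction changes, the absolute value does not.
vx, vy, vz = transform_velocity(0.0, 1.0, 0.0, V=0.6)
print(vx, vy, math.sqrt(vx * vx + vy * vy + vz * vz))
```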

Note here that the substitution v’=c in Eq. (3.28) is not 
quite consistent, since material particles representing a “signal” 
cannot move at the velocity c, the equation being derived for material
particles (m ≠ 0). However, one may assume v′ = c in
Eq. (3.28) considering light quanta (photons) as relativistic particles
(§ 7.6). Besides, there are ultra-relativistic particles whose
velocity is close to that of light. For example, the velocity of electrons
in the electron accelerator in Erevan differs from c in the
eighth (!) decimal place.

We shall now give the explanation of the results of the Fizeau 
experiment as another example of utilizing Eq. (3.28) for the 
velocity summation. In this experiment the velocity of light pro- 
pagating in water motionless relative to an observer (laboratory) 
was compared with that in water moving at the velocity V. The 


Fig. 3.7. The schematic drawing of the Fizeau experiment. A semi-transparent
plate PP splits a light beam from a source into two light pencils, one going
along and another against the water flow. Fresnel, who studied light propagation
in moving media, anticipated the velocity of light v = v′ ± kV for the observer
relative to which water moves at the velocity V, with the velocity of
light in motionless water being equal to v′; here the sign “+” corresponds to
light propagation along the water flow, and the sign “−” to light propagation
against the water flow; the coefficient k is called the drag coefficient. This coefficient
was sought by Fizeau who conducted the described experiment. It followed
from the Fizeau experiment that k = 1 − 1/n². The theory of relativity
explains quite naturally Fizeau’s result (see the text). The details of the experiment
can be found in books [13] and [15].


velocity of light in motionless water is equal to c/n, where n is 
the refraction index of water. The light pencil travelled through 
the moving water (Fig. 3.7), and its velocity was determined in 
the laboratory frame of reference from an interference of two light 
pencils going one along and another against the water flow. It 
followed from the results obtained in the Fizeau experiment that 
the phase velocity of light in motionless water should be in- 
creased by the velocity of water V multiplied by (1 — 1/n?). Thus, 
if the phase velocity of light in motionless water is v’ = c/n, the 
phase velocity, found in the laboratory frame, turns out to be 


$$v = \frac{c}{n} + V\left(1 - \frac{1}{n^2}\right). \tag{3.29}$$

Using Eq. (3.28), we conclude that in the laboratory frame (due to
the velocity summation law)

$$v = \frac{\dfrac{c}{n} + V}{1 + \dfrac{V}{cn}} = \frac{\left(\dfrac{c}{n} + V\right)\left(1 - \dfrac{V}{cn}\right)}{1 - \left(\dfrac{V}{cn}\right)^2}.$$

Neglecting the quantity (V/cn)² in the denominator whose smallness
is due to the non-relativistic velocity of water ((V/c)² ≪ 1),
we obtain

$$v \approx \frac{c}{n} + V\left(1 - \frac{1}{n^2}\right) - \frac{V^2}{cn} = \frac{c}{n}\left[1 + \frac{V}{c}\left(1 - \frac{1}{n^2}\right)n - \frac{V^2}{c^2}\right]. \tag{3.30}$$

In Eq. (3.30) we again neglect the term V²/c² and get the Fizeau
result (Eq. (3.29)). Thus the Einstein equation for the velocity
transformation provides a natural interpretation of the results
of the Fizeau experiment (see § 1.7).
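
The quality of the approximation can be checked numerically. The sketch below applies the velocity summation law of Eq. (3.28) to light in moving water and compares the resulting drag coefficient with 1 − 1/n². The function names, the refractive index n = 1.333 and the water velocity V = 7 m/s are illustrative assumptions, not data from the text.

```python
C = 299_792_458.0  # speed of light in vacuo, m/s


def light_speed_in_moving_water(n, V, c=C):
    """Laboratory velocity of light from Eq. (3.28) with v' = c/n."""
    v_rest = c / n
    return (v_rest + V) / (1.0 + v_rest * V / c ** 2)


def drag_coefficient(n, V, c=C):
    """Effective coefficient k defined by v = c/n + k*V."""
    return (light_speed_in_moving_water(n, V, c) - c / n) / V


n = 1.333  # approximate refractive index of water (assumed value)
V = 7.0    # water velocity in m/s (assumed, non-relativistic)
print("k from the summation law:", drag_coefficient(n, V))
print("k = 1 - 1/n^2           :", 1.0 - 1.0 / n ** 2)
```

The two values agree to within the terms of the order of V/c that were neglected in the derivation.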

We have already pointed out that the most important assump- 
tion of contemporary physics is the statement concerning im- 
possibility of transmitting signals (interactions) at the velocity 
exceeding that of light. No doubt, a moving object can be used to
transmit a signal (energy, momentum), and, consequently, the
velocity of an object cannot exceed c. Relativistic mechanics infers
that the velocity of a material object, i.e. an object possessing
rest mass, is always less than c and never reaches this value.
But this is valid in a definite IFR. Is it possible to choose such
an IFR in which the velocity of an object will exceed c?

Had it followed from classical mechanics that in a given IFR
the velocity of an object never exceeds c, one would have ob- 
tained the velocity of an object exceeding c via a choice of a suitable 
reference frame. Indeed, according to Eq. (1.4) v = v’ + V, where 
v’ is the velocity of the object relative to the frame K’ and V is 
the relative velocity of the frames K and K’. If the velocities v’ 
and V exceed 0.5 c, the velocity v of the object in the frame K 
will be greater than c. 

But in the STR the velocity transformation is carried out 
differently. It is seen from Eqs. (3.26) and (3.27) that the ve- 
locities of a particle and of a frame do not add up as vectors do. 
Moreover, the velocity summation in the STR obeys the incredible
rule c + c = c.

It follows from Eqs. (3.26) and (3.27) that if the velocity of a 
particle is less than c in the frame K (v/c < 1), and the velocity
of the frame K’ relative to K is also less than c (V/c < 1), the
velocity of this particle determined in the frame K’ is always less
than c. The simplest demonstration of this statement can be carried
out for the case of a unidimensional motion by means of Eq.
(3.28). Having composed the expression (v/c − 1), we write out
the following chain of equations:

$$\frac{v}{c} - 1 = \frac{\dfrac{v'}{c} + \dfrac{V}{c}}{1 + \dfrac{v'V}{c^2}} - 1 = -\frac{\left(1 - \dfrac{v'}{c}\right)\left(1 - \dfrac{V}{c}\right)}{1 + \dfrac{v'V}{c^2}} < 0. \tag{3.31}$$

Whence it is clear that v < c whenever v′ < c and V < c; owing to the
equivalence of the frames, the same holds with the roles of K and K’
interchanged.

But is it possible to get the relative velocity of reference frames 
exceeding c by means of consecutive transitions from one frame 
to another? Strictly speaking, a reference frame is a system of 
material objects, so that in order to answer the question we can 
make use of the theorem just formulated. Certainly, in the STR 
one cannot obtain the relative velocity of frames exceeding c in
any case. But now we shall derive this result once again by 
another method, which is instructive by itself. 

Let us introduce, aside from the frame K, two more frames, K’
and K″. What is the relative velocity of the frames K and K″ if,
on the one hand, the relative velocity of the frames K’ and K″,
and, on the other hand, that of the frames K and K’ are known?
Let the relative velocity of K and K’ be equal to V and that of K’
and K″ be equal to W. Introducing the designations $B_1 = V/c$
and $B_2 = W/c$ and, correspondingly, $1/\Gamma_1 = \sqrt{1 - B_1^2}$,
$1/\Gamma_2 = \sqrt{1 - B_2^2}$, we get

$$x = \Gamma_1 (x' + Vt'), \qquad t = \Gamma_1\left(t' + \frac{V}{c^2}\,x'\right), \tag{3.32}$$

$$x' = \Gamma_2 (x'' + Wt''), \qquad t' = \Gamma_2\left(t'' + \frac{W}{c^2}\,x''\right). \tag{3.33}$$

Substituting Eq. (3.33) into Eq. (3.32), we find the explicit relationship
between coordinates and time in the frames K and K″:

$$x = \Gamma_1\Gamma_2 (x'' + Wt'' + Vt'' + B_1B_2 x'') = \Gamma_1\Gamma_2 \{(1 + B_1B_2)\,x'' + (V + W)\,t''\} = \Gamma_1\Gamma_2 (1 + B_1B_2)\left(x'' + \frac{V + W}{1 + B_1B_2}\,t''\right). \tag{3.34}$$

In much the same way one can obtain

$$t = \Gamma_1\Gamma_2 (1 + B_1B_2)\left(t'' + \frac{1}{c^2}\,\frac{V + W}{1 + B_1B_2}\,x''\right). \tag{3.35}$$

Designating

$$U = \frac{V + W}{1 + VW/c^2}, \qquad B = \frac{U}{c} = \frac{B_1 + B_2}{1 + B_1B_2}, \tag{3.36}$$


we shall calculate the first multiplier in Eqs. (3.34) and (3.35):

$$\Gamma_1\Gamma_2 (1 + B_1B_2) = \frac{1}{\sqrt{\dfrac{(1 - B_1^2)(1 - B_2^2)}{(1 + B_1B_2)^2}}} = \frac{1}{\sqrt{\dfrac{1 + B_1^2B_2^2 + 2B_1B_2 - (B_1^2 + B_2^2 + 2B_1B_2)}{(1 + B_1B_2)^2}}} = \frac{1}{\sqrt{1 - \left(\dfrac{B_1 + B_2}{1 + B_1B_2}\right)^2}} = \frac{1}{\sqrt{1 - \dfrac{U^2}{c^2}}}.$$

Then from Eqs. (3.34) and (3.35) one can get

$$x = \frac{x'' + Ut''}{\sqrt{1 - U^2/c^2}}, \qquad t = \frac{t'' + \dfrac{U}{c^2}\,x''}{\sqrt{1 - U^2/c^2}}.$$

Consequently, the two successive Lorentz transformations with
the relative velocities V and W of the reference frames are equivalent
to one transformation with the relative velocity U determined
according to Eq. (3.36). In other words, the relative velocities of
reference frames “add up” according to Eq. (3.28). But we
have already argued that such a summation will not produce a 
velocity greater than that of light. 
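
The equivalence of two successive transformations to a single one can be confirmed numerically. The sketch below (function names, units with c = 1 and sample values are assumptions) transforms an event from K″ to K’ with the velocity W, then from K’ to K with the velocity V, and compares the result with one transformation at U = (V + W)/(1 + VW/c²).

```python
import math


def boost(x_p, t_p, V, c=1.0):
    """Transformation of the form x = Gamma*(x' + V*t'), t = Gamma*(t' + V*x'/c**2)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return gamma * (x_p + V * t_p), gamma * (t_p + V * x_p / c ** 2)


def add_velocities(V, W, c=1.0):
    """Resulting relative velocity U = (V + W) / (1 + V*W/c**2)."""
    return (V + W) / (1.0 + V * W / c ** 2)


V, W = 0.5, 0.7        # relative velocities of K, K' and of K', K'' (illustrative)
x_pp, t_pp = 2.0, 3.0  # an event in K'' (illustrative)

two_steps = boost(*boost(x_pp, t_pp, W), V)           # K'' -> K' -> K
one_step = boost(x_pp, t_pp, add_velocities(V, W))    # K'' -> K directly
print(two_steps)
print(one_step)
print("U for V = W = 0.9:", add_velocities(0.9, 0.9))
```

The two results coincide within rounding error, and U stays below c however close V and W are to c.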

Eq. (3.36) can readily be obtained by means of a complex rotation
(see § 2.8). In geometrical terms the transition from K
to K’ and then from K’ to K″ constitutes a consecutive rotation
in the plane (x, t) through the angles φ₁ and φ₂ with tan φ₁ = iB₁
and tan φ₂ = iB₂.

The tangent of the resulting angle can be found according to
the conventional formula for the tangent of the sum of two angles
(φ = φ₁ + φ₂):

$$\tan\varphi = \frac{\tan\varphi_1 + \tan\varphi_2}{1 - \tan\varphi_1 \tan\varphi_2},$$

or

$$iB = \frac{iB_1 + iB_2}{1 + B_1B_2},$$

which is just Eq. (3.36) with B, B₁ and B₂ being replaced by their
respective values. The two last expressions show that the set of
the Lorentz transformations possesses the basic property of the
group (from the standpoint of the mathematical group theory):
two Lorentz transformations in succession again produce a Lorentz
transformation. It is essential here, however, that the relative velocity is
always directed along the x axis.

Here is a useful interpretation of Eq. (3.28). In § 2.7 we introduced
the parameter Θ associated with the relative velocity of
reference frames by the relationship B = V/c = tanh Θ. We can also
introduce the velocity parameter for a particle: β = v/c = tanh θ. Then
Eq. (3.28) will take the following form:

$$\beta = \frac{v}{c} = \frac{\beta' + B}{1 + \beta' B} = \frac{\tanh\theta' + \tanh\Theta}{1 + \tanh\theta' \tanh\Theta} = \tanh(\theta' + \Theta); \tag{3.37}$$

the last link of the equation is written in accordance with the
equations of Appendix I, § 9. This is an interesting result. In the
classical theory, it is velocities that add up (Eq. (2.4)), while
in the relativistic theory velocity parameters do. The latter circumstance
will be put to use in § 5.7.
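
The additivity of velocity parameters is easy to check numerically. In the sketch below the function names and the sample values are illustrative assumptions; velocities are expressed in units of c.

```python
import math


def add_directly(beta_p, B):
    """One-dimensional summation rule: beta = (beta' + B) / (1 + beta'*B)."""
    return (beta_p + B) / (1.0 + beta_p * B)


def add_via_parameters(beta_p, B):
    """The same rule through velocity parameters: tanh(theta' + Theta)."""
    return math.tanh(math.atanh(beta_p) + math.atanh(B))


print(add_directly(0.5, 0.7), add_via_parameters(0.5, 0.7))

# Ten successive transitions at 0.5c each: the parameters add up linearly,
# while the resulting velocity only approaches c.
print(math.tanh(10 * math.atanh(0.5)))
```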

§ 3.6. The transformation of an absolute value and the direction 
of the velocity of a particle. From Eq. (3.26) for the velocity com- 
ponent transformation one can obtain expressions determining 
the absolute value and direction of the velocity in the frame K, 
provided the velocity components in the frame K’ are known. 
First of all, it is evident that if the velocity component
$v'_z = 0$ in the frame K’, the component $v_z = 0$ in the frame K
too. This means that if the motion in the frame K’ takes place in the
plane (x’, y’), the motion in the frame K will also take place in the
plane (x, y). Let us choose the x’ and y’ axes so that the velocity
of a particle lies in the plane (x’, y’) of the frame K’. Then it is
clear that if $\theta'$ is the angle between the direction of the
velocity $v'$ and the x’ axis then $v'_x = v'\cos\theta'$,
$v'_y = v'\sin\theta'$. We shall denote the angle between the
direction of the velocity $v$ and the x axis in the frame K by
$\theta$. Consequently, $v_x = v\cos\theta$, $v_y = v\sin\theta$.
Let us find the equations relating $v$ and $\theta$ with $v'$ and
$\theta'$. Having expressed the components $v'_x$ and $v'_y$ in terms
of $v'$ and $\theta'$, we can rewrite the first two formulae of
Eq. (3.26) in the following form:

$$ v\cos\theta = \frac{v'\cos\theta' + V}{1 + \frac{v'V}{c^2}\cos\theta'}, \qquad v\sin\theta = \frac{v'\sin\theta'\,\sqrt{1-\beta^2}}{1 + \frac{v'V}{c^2}\cos\theta'}. \tag{3.38} $$

Having divided the second formula by the first one, we obtain
the expression for $\tan\theta$:

$$ \tan\theta = \frac{v'\sqrt{1-\beta^2}\,\sin\theta'}{v'\cos\theta' + V}. \tag{3.39} $$

In order to find the expression for the absolute value of the
velocity, it is sufficient to square and sum termwise Eq. (3.38); we
shall obtain at once

$$ v^2 = \frac{v'^2 + V^2 + 2v'V\cos\theta' - v'^2\beta^2\sin^2\theta'}{\left(1 + \frac{v'V}{c^2}\cos\theta'\right)^2} = \frac{(\mathbf{v}' + \mathbf{V})^2 - \frac{1}{c^2}[\mathbf{v}'\mathbf{V}]^2}{\left(1 + \frac{\mathbf{v}'\mathbf{V}}{c^2}\right)^2}, \tag{3.40} $$

where $[\mathbf{v}'\mathbf{V}]$ denotes the vector product and
$\mathbf{v}'\mathbf{V}$ the scalar product.

It follows from Eqs. (3.39) and (3.40) that the angle between
the velocity direction and the corresponding x axis, as well as the
absolute value of the velocity, change on transition from the frame
K’ to the frame K (Fig. 3.8a). (Recall that the geometric axes x
and x’ coincide.) Of course, the same occurs in classical mechanics
as well, although it is described by other equations.
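
The transformation of the absolute value and direction can be sketched in a few lines. The code below assumes units with c = 1 by default and uses hypothetical function names. It transforms the components as in Eq. (3.38) and compares the outcome with the closed expressions (3.39) and (3.40).

```python
import math

def transform_velocity(v_p: float, theta_p: float, V: float, c: float = 1.0):
    """Return (v, theta) in K from the speed v_p and angle theta_p measured in K'."""
    beta = V / c
    denom = 1 + v_p * V * math.cos(theta_p) / c**2
    vx = (v_p * math.cos(theta_p) + V) / denom                      # v cos(theta), Eq. (3.38)
    vy = v_p * math.sin(theta_p) * math.sqrt(1 - beta**2) / denom   # v sin(theta), Eq. (3.38)
    return math.hypot(vx, vy), math.atan2(vy, vx)

def tan_theta_3_39(v_p: float, theta_p: float, V: float, c: float = 1.0) -> float:
    beta = V / c
    return v_p * math.sqrt(1 - beta**2) * math.sin(theta_p) / (v_p * math.cos(theta_p) + V)

def speed_squared_3_40(v_p: float, theta_p: float, V: float, c: float = 1.0) -> float:
    beta = V / c
    num = (v_p**2 + V**2 + 2 * v_p * V * math.cos(theta_p)
           - v_p**2 * beta**2 * math.sin(theta_p)**2)
    return num / (1 + v_p * V * math.cos(theta_p) / c**2)**2

v, theta = transform_velocity(0.5, math.pi / 3, 0.6)
print(v, math.degrees(theta))
```

For a frame K’ moving in the positive x direction the angle in K comes out smaller than in K’, in line with Fig. 3.8a.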

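Now let us derive a useful formula resulting from Eq. (3.40).
We shall need it when studying Chapter 7. Using the first equation
of (3.40), we compose the following expression:

$$ c^2 - v^2 = \frac{c^2\left(1 + \frac{v'V}{c^2}\cos\theta'\right)^2 - v'^2 - V^2 - 2v'V\cos\theta' + \frac{v'^2V^2}{c^2}\sin^2\theta'}{\left(1 + \frac{v'V}{c^2}\cos\theta'\right)^2} = \frac{c^2\left(1 - \frac{v'^2}{c^2}\right)\left(1 - \frac{V^2}{c^2}\right)}{\left(1 + \frac{v'V}{c^2}\cos\theta'\right)^2}. $$

Consequently,

$$ \sqrt{1 - \frac{v^2}{c^2}} = \frac{\sqrt{1 - \frac{v'^2}{c^2}}\;\sqrt{1 - \frac{V^2}{c^2}}}{1 + \frac{v'V}{c^2}\cos\theta'}, \tag{3.41} $$

or, with $\beta' = v'/c$ and $\beta = V/c$,

$$ \sqrt{1 - \frac{v^2}{c^2}} = \frac{\sqrt{1-\beta'^2}\;\sqrt{1-\beta^2}}{1 + \beta'\beta\cos\theta'}. \tag{3.42} $$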


From Eq. (3.41) it is also easy to obtain a convenient equation
for the square of the absolute value of the velocity:

$$ v^2 = c^2\left[1 - \frac{\left(1 - \frac{v'^2}{c^2}\right)\left(1 - \frac{V^2}{c^2}\right)}{\left(1 + \frac{v'V}{c^2}\cos\theta'\right)^2}\right], \tag{3.43} $$

from which it immediately follows that if $v'/c$ and $V/c$ are less
than unity, then $v < c$. For the special case, when the velocity
was determined according to Eq. (3.28), the same theorem was
demonstrated above. Eq. (3.43) can be obtained, of course, by
squaring and summing up the left-hand and right-hand sides of
Eq. (3.26).
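
The bound $v < c$ contained in Eq. (3.43) can be probed numerically. This is only a sketch: the grid of sample values is an arbitrary choice, the function name is hypothetical, and c = 1 is assumed by default.

```python
import math

def speed_squared_3_43(v_p: float, theta_p: float, V: float, c: float = 1.0) -> float:
    """Square of the speed in K written in the form of Eq. (3.43)."""
    factor = (1 - v_p**2 / c**2) * (1 - V**2 / c**2)
    return c**2 * (1 - factor / (1 + v_p * V * math.cos(theta_p) / c**2)**2)

# Scan subluminal speeds and several directions; the result never reaches c^2.
samples = [speed_squared_3_43(v_p, th, V)
           for v_p in (0.1, 0.5, 0.9, 0.99)
           for V in (0.1, 0.5, 0.9, 0.99)
           for th in (0.0, 0.5, 1.5, 2.5)]
print(max(samples) < 1.0)

# For light (v' = c) the bracket vanishes and v = c for any direction.
print(speed_squared_3_43(1.0, 0.7, 0.4))
```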
From Eqs. (3.38) and (3.39) one can readily obtain the equations
determining the variation of the direction of light rays on
transition from the frame K’ to the frame K. In this case, having
assumed $v' = c$ in Eq. (3.43), we obtain, as it should be expected,
$v = v' = c$. Taking this into account, we obtain from Eqs. (3.39)
and (3.38) respectively

$$ \tan\theta = \frac{\sqrt{1-\beta^2}}{\beta + \cos\theta'}\,\sin\theta', \tag{3.44} $$

$$ \sin\theta = \frac{\sqrt{1-\beta^2}}{1 + \beta\cos\theta'}\,\sin\theta', \qquad \cos\theta = \frac{\cos\theta' + \beta}{1 + \beta\cos\theta'}. \tag{3.45} $$

Fig. 3.8. (a) A particle moves in the plane (x’, y’) in the frame K’.
The inclination angle of its velocity to the x’ axis is equal to
$\theta'$, with $\tan\theta' = v'_y/v'_x$. In the frame K the
components $v_y$ and $v_x$ vary according to Eq. (3.26), whence it is
clear that the angle $\theta$ is no longer equal to $\theta'$ (see
also Eq. (3.39)). For the case shown in the diagram,
$\theta' > \theta$. (b) Light propagates along the y’ axis in the
frame K’, i.e. along the perpendicular to the frame motion direction.
Obviously, $v'_x = 0$, $v'_y = c$, $\theta' = \pi/2$. In the frame K,
according to Eq. (3.26), $v_x = V$, $v_y = c\sqrt{1-\beta^2}$, whence
$\tan\theta = \sqrt{1-\beta^2}/\beta$. The aberration angle is formed
by the visible direction of incoming light in the frame K and the
direction of light in the frame K’, i.e. the y axis. The aberration
angle $\alpha = \pi/2 - \theta$.


Eqs. (3.45) describe the light aberration consisting in the wave
front of a light wave changing its direction on transition from
one IFR to another. Let light propagate in the frame K’ along
the perpendicular to the frame motion direction, for example,
along the y’ axis (Fig. 3.8b); this means that $\theta' = \pi/2$.
Then, according to Eq. (3.44),

$$ \tan\theta = \frac{\sqrt{1-\beta^2}}{\beta}. $$

The aberration angle is constituted by the visible directions of
light in the two IFRs. In the frame K’ light propagates along the
y’ axis, while in the frame K at the angle $\alpha = \pi/2 - \theta$
to the y axis. Obviously, the angle $\alpha$ is the aberration angle,
and

$$ \tan\alpha = \tan\left(\frac{\pi}{2} - \theta\right) = \cot\theta = \frac{\beta}{\sqrt{1-\beta^2}}. $$

The calculation of the aberration angle $\alpha$ according to the
Galilean transformation (do the calculation yourself!) gives
$\tan\alpha = \beta$. This means that the relativistic formula
differs from the non-relativistic one by a relative correction of
the order of $\beta^2$.
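
To get a feeling for the size of the difference, the two expressions for $\tan\alpha$ can be compared directly. This is a sketch: the value β = 1e-4 is an illustrative assumption (roughly the Earth's orbital speed divided by c, not a figure taken from the text), and the function names are hypothetical.

```python
import math

def aberration_relativistic(beta: float) -> float:
    """Aberration angle for perpendicular incidence: tan(alpha) = beta / sqrt(1 - beta^2)."""
    return math.atan(beta / math.sqrt(1 - beta**2))

def aberration_galilean(beta: float) -> float:
    """The same angle from the Galilean transformation: tan(alpha) = beta."""
    return math.atan(beta)

beta = 1e-4  # illustrative assumption
alpha_rel = aberration_relativistic(beta)
alpha_gal = aberration_galilean(beta)
print(math.degrees(alpha_rel) * 3600)        # aberration angle in seconds of arc
print((alpha_rel - alpha_gal) / alpha_gal)   # relative difference, of the order of beta^2
```

Since $\tan\alpha = \beta/\sqrt{1-\beta^2}$ is equivalent to $\sin\alpha = \beta$, the relativistic angle is simply arcsin β.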

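One can also readily obtain the equations for the aberration
angle in the case when light falls at an arbitrary angle to the
motion direction. We shall assume that $\beta \ll 1$. Then, according
to Eq. (3.45),

$$ \sin\theta = \left(1 - \tfrac{1}{2}\beta^2 + \ldots\right)\left(1 - \beta\cos\theta' + \ldots\right)\sin\theta'. $$

Rejecting all the terms starting from $\beta^2$ and higher, we obtain

$$ \sin\theta - \sin\theta' = -\beta\sin\theta'\cos\theta'. $$

The angle $\theta' - \theta = \Delta\theta$ is the aberration angle.
Since the right-hand side, proportional to $\beta$, is small,
$\Delta\theta$ is small as well:

$$ \sin\theta - \sin\theta' = 2\cos\frac{\theta + \theta'}{2}\,\sin\frac{\theta - \theta'}{2} \approx -\cos\theta'\,\Delta\theta. $$

Consequently,

$$ \Delta\theta = \beta\sin\theta'. \tag{3.46} $$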
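
How good the first-order result (3.46) is can be judged by comparing it with the exact angle following from Eq. (3.45). A minimal sketch with hypothetical function names:

```python
import math

def aberration_exact(theta_p: float, beta: float) -> float:
    """theta' - theta computed from the exact formulae (3.45)."""
    denom = 1 + beta * math.cos(theta_p)
    sin_t = math.sqrt(1 - beta**2) * math.sin(theta_p) / denom
    cos_t = (math.cos(theta_p) + beta) / denom
    return theta_p - math.atan2(sin_t, cos_t)

def aberration_first_order(theta_p: float, beta: float) -> float:
    """The approximation of Eq. (3.46), valid for beta much less than unity."""
    return beta * math.sin(theta_p)

theta_p, beta = 1.0, 1e-3
print(aberration_exact(theta_p, beta), aberration_first_order(theta_p, beta))
```

For β = 0.001 the two values differ only by terms of the order of β², which were rejected in the derivation.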


This elementary equation describes the aberration of light falling
at the angle $\theta'$ in the frame K’. Fig. 3.8 illustrates the
change of the direction of particle velocity on transition from the
frame K’ to K, as well as the calculation of an aberration angle in
the case of a perpendicular (relative to the motion) incidence of
light. See Supplement II about the role that the aberration played in
the development of the STR.

Finally, let us calculate the relative velocity of two particles.
It is natural to define the relative velocity of two particles as the
velocity of one of them in the frame K in which the other particle
is at rest. Let the velocities of the particles in the frame K’ be
$\mathbf{v}'_1$ and $\mathbf{v}'_2$. Choose the coordinate system K
such that $\mathbf{V} = -\mathbf{v}'_2$. The particle velocities are
immediately determined from Eq. (3.40). The absolute value of the
velocity $v_2$ is equal to zero, while that of the first particle is

$$ v^2 = \frac{(\mathbf{v}'_1 - \mathbf{v}'_2)^2 - \frac{1}{c^2}[\mathbf{v}'_1\mathbf{v}'_2]^2}{\left(1 - \frac{\mathbf{v}'_1\mathbf{v}'_2}{c^2}\right)^2}. \tag{3.47} $$

This expression defines the square of the relative velocity of the
two particles. Eq. (3.47) is symmetric relative to $\mathbf{v}'_1$
and $\mathbf{v}'_2$.

It was shown (see Eq. (3.43)) that $v^2$ is always less than $c^2$
in Eq. (3.40). This is also the case for Eq. (3.47): the relative
velocity of particles cannot exceed the velocity of light in vacuo.
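
Eq. (3.47) is easy to evaluate for concrete velocities. The sketch below uses plain tuples for three-dimensional vectors and assumes c = 1 by default; the helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def relative_speed(v1, v2, c: float = 1.0) -> float:
    """Relative speed of two particles according to Eq. (3.47)."""
    diff = tuple(x - y for x, y in zip(v1, v2))
    cr = cross(v1, v2)
    numerator = dot(diff, diff) - dot(cr, cr) / c**2
    denominator = (1 - dot(v1, v2) / c**2)**2
    return math.sqrt(numerator / denominator)

head_on = relative_speed((0.9, 0.0, 0.0), (-0.9, 0.0, 0.0))
perpendicular = relative_speed((0.6, 0.0, 0.0), (0.0, 0.8, 0.0))
print(head_on, perpendicular)
```

In both examples the classical difference of velocities would be equal to or greater than c, while Eq. (3.47) gives a value below c.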

§ 3.7. The K calculus (the radar method). We shall present
below an elegant method of deriving the basic consequences of 
the Einstein postulates. This method could be briefly presented on 
the basis of the results obtained earlier by other methods. How- 
ever, we intend to reiterate some conclusions in order to make 
this section more or less self-consistent. The method is remarkable 
because it dispenses with the coordinate routine, even in the 
derivation of the Lorentz transformation. Although graphical illu- 
strations used below involve coordinates, they have an auxiliary 
character: these coordinates are not indispensable for the pre- 
sentation of the method but are useful for those who are familiar 
with space-time diagrams. 

Only one spatial coordinate is to be considered. Many charac- 
teristics of the STR are revealed even in this case permitting of 
descriptive illustrations. Thus, let all events occur on the x axis 
(and respectively on the coincident x’ axis of the frame K’, see 
Fig. 1.2). In the K calculus all conclusions are drawn from the 
imaginary experiments consisting primarily in the exchange of 
light signals in vacuo; their sending, reflection and reception are 
examined. In the final analysis, such a play with light spots makes 
it possible to obtain the basic consequences of the Einstein postu- 
lates. 

The principal assumption which is made in the K calculus is 
based on the Doppler effect (see § 3.3); in a unidimensional case 
it is always the radial Doppler effect. Thus, if a stationary radar 
located in the frame K emits short pulses periodically with time
intervals (periods) T, an observer in the frame K’ moving away
from this radar at constant velocity will discover that the in- 
terval between the incoming light pulses is different, despite the 
fact that the rates of the clock fixed to the radar and that of the 
observer from the frame K’ are identical. 

For the sake of simplicity we shall speak not of the radar and
receiver, but of the two observers A and A’ at rest in the reference
frames K and K’ respectively. Thus, if the observer A sends light
signals separated by the time interval T according to his clock,
the observer A’ will receive these signals separated by a different
interval as measured by his own clock. Let us designate this
interval by KT. That is how the coefficient K appears, the key
quantity of the considered method.

It should be pointed out that T and KT are the time intervals
between the sending of the first and of the second signals by the
observer A and the reception of these signals by the observer A’,
measured in each case by the clocks at rest in the frames K and
K’ respectively.

Proceeding from the principal properties of space and time, 
their uniformity and isotropy, one can assume that the coefficient 
K depends neither on the positions of the receiver and the source, 
nor on the time of sending and receiving the signal, nor on the 
direction in which the signal is sent (in other words, the direction
of the common x, x’ axis may be chosen arbitrarily in space).
Certainly, this coefficient does not depend on the time interval
between the sendings of the signals. It may depend only on the
relative velocity of the observers A and A’. Indeed, as experience
shows, the variation of the light frequency due to the
Doppler effect depends only on the velocity of the relative motion.

The reason for the appearance of the coefficient K is evident.
Let the observer A located at the origin of the reference frame K
send light signals to the observer A’ located at the origin of the
frame K’. The frame K’ moves away from the frame K to the
right. Let the first signal be sent at the moment of time t. Then
it is easy to determine the time $\tau_1$, by the clock of the
observer A, which the signal takes to reach the observer A’. Indeed,
the signal propagating at the velocity c has to travel during the
time $\tau_1$ the distance $Vt$ which separated the observers A and
A’ at the moment t and the distance $V\tau_1$ which will be covered
by the observer A’ during the time $\tau_1$:
$c\tau_1 = Vt + V\tau_1$, whence it follows that
$\tau_1 = \frac{V}{c - V}\,t$. The second signal is emitted at the
moment $t + T$ and it reaches A’ in the time $\tau_2$ determined from
the equation $c\tau_2 = V(t + T) + V\tau_2$. Consequently,
$\tau_2 = \frac{V}{c - V}\,(t + T)$. The difference
$\tau_2 - \tau_1 = \frac{V}{c - V}\,T$ determines the time interval
between the signals received by the observer A’: by the clock of A
the two receptions are separated by
$T + (\tau_2 - \tau_1) = T/(1 - \beta)$. However, we have not yet
found the expression for the coefficient K, although it may seem so.
The coefficient K will be obtained as soon as we find the time
interval between the incoming signals registered by the clock of the
observer A’. But we have not determined the relationship between
the readings of the clocks A and A’ so far.
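
The bookkeeping of this paragraph, done entirely by the clock of the observer A, can be reproduced as follows. This is a sketch with a hypothetical function name and c = 1 by default. It returns the interval between the two receptions as judged in the frame K, which is not yet the coefficient K.

```python
def reception_interval_by_clock_A(t: float, T: float, V: float, c: float = 1.0) -> float:
    """Interval between the receptions at A' of signals sent at t and t + T (frame K time)."""
    tau1 = V * t / (c - V)            # travel time of the first signal
    tau2 = V * (t + T) / (c - V)      # travel time of the second signal
    return (t + T + tau2) - (t + tau1)

print(reception_interval_by_clock_A(t=3.0, T=1.0, V=0.6))
```

The result equals T/(1 − β) and does not depend on the moment t at which the first signal is sent, in agreement with the uniformity of time.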

Until now we put to use only the uniformity and isotropy of 
time and space. Now we shall make use of the constancy of the 
velocity of light in vacuo in all IFRs. We shall have to use this 
property of light very often. This condition can be formulated as 
follows: “light cannot overtake light.” Now we pass over to the 
problem which employs explicitly the equivalence of all inertial 
observers, i.e. the first postulate of Einstein. 

We have agreed that the signals sent by the observer A at the
intervals T will be received by the observer A’ at the intervals KT
as measured by his clock. Due to the equivalence of the observers
we have to suppose that the signals sent by the observer A’ at the
intervals T will be received by the observer A at the intervals KT
as well. (The principle of relativity for two inertial observers A
and A’.)

It is worth mentioning that this assumption is strongly based 
on the fact that vacuum contains no medium in which light propagates.
Had such a medium existed, the coefficient K would have
depended on the velocities of the observers A and A’ relative to 
this medium. It was just such a medium (ether) that agitated the 
minds of the 19th century physicists most of all. It caused a series 
of dramatic situations preceding the advent of the STR (see Sup- 
plement II). At present it is quite reasonable to adopt the contem- 
porary point of view. 

Now we shall find the explicit expression of the coefficient K in
terms of a relative motion. In this procedure we shall need nothing
except a few imaginary experiments pertaining to the sending,
reflection and reception of light signals. The reflection can
be treated, if necessary, as the sending of the signals by the
“observer” in the reverse direction at the moment when he receives
the incoming signal.

Let the first signal from the observer A to the observer A’ be
sent at the moment when the frames K and K’ coincide. The observers
A and A’ located at the origins of their respective frames
are positioned at this moment at the same point in space. Naturally,
the transmission of this signal from A to A’ and of the
reverse signal from A’ to A does not require any time. After the
time interval T by his clock the observer A sends a light signal
to the observer A’ who will receive it in the time interval KT after
the reception of the first signal. Let the observer A’ send a signal
back to A immediately on the reception of the second signal (the
same as the mirror reflection). The two signals are separated by
the time interval KT by the clock of the observer A’. Hence, the
return signal will be sent from A’ to A after this time interval.
But the observer A will not receive it after the time interval KT.
This time the interval will be increased K times again and will
be equal to $K^2T$. Consequently, the return signal will be received
at the moment $K^2T$ by the clock of the observer A. Hence, in terms
of the observer A the total travel of the second signal sent at the
moment T to the observer A’ and back takes the time
$K^2T - T = (K^2 - 1)T$. Since the velocity of light is the same
whether it propagates in the direct or the opposite direction, the
propagation time from A to A’ (or back) is equal to
$\tfrac{1}{2}(K^2 - 1)T$. From this it follows that the determination
of the distance between A and A’ at the moment of reflection by means
of a radar will give the value $\tfrac{1}{2}(K^2 - 1)Tc$.


108 Special Theory of Relativity 


Thus, we have found the distance between the observers A 
and A’ at the moment when the signal is reflected. But at what 
moment by the clock of A did the reflection occur? Note that we 
speak of the clock located at A, while the event that we consider, 
i.e. the reflection of the signal at A’, is removed from A. In this 
case we cannot measure the time of the event directly but have 
to ascribe a definite moment of time to it. 

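The second light signal was sent at the moment T and was
received back at the moment $K^2T$. Hence, the moment of reflection
is determined as $\tfrac{1}{2}(T + K^2T) = \tfrac{1}{2}(K^2 + 1)T$.
Consequently, during the time interval $\tfrac{1}{2}(K^2 + 1)T$ the
observer A’ moves away from the observer A by the distance
$\tfrac{1}{2}(K^2 - 1)Tc$. So, the relative velocity of the
observer A’ is

$$ V = \frac{\tfrac{1}{2}(K^2 - 1)Tc}{\tfrac{1}{2}(K^2 + 1)T} = \frac{K^2 - 1}{K^2 + 1}\,c. \tag{3.48} $$

It therefore follows * that

$$ K = \sqrt{\frac{1 + \beta}{1 - \beta}}. \tag{3.49} $$

Here we shall write the two equations which we shall need later:

$$ \frac{K^2 + 1}{2K} = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \frac{K^2 - 1}{2K} = \frac{\beta}{\sqrt{1 - \beta^2}}. \tag{3.50} $$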
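
The relations between K and β can be checked against a direct computation of the radar experiment in the frame K. The sketch below assumes c = 1, and its function names are hypothetical. The round trip is followed with ordinary kinematics, and the return moment is compared with $K^2T$.

```python
import math

def k_from_beta(beta: float) -> float:
    """Eq. (3.49)."""
    return math.sqrt((1 + beta) / (1 - beta))

def beta_from_k(K: float) -> float:
    """Eq. (3.48) with c = 1."""
    return (K**2 - 1) / (K**2 + 1)

def radar_round_trip(T: float, beta: float):
    """Moments (by the clock of A) of reflection at A' and of return to A, for a signal sent at T."""
    t_reflect = T / (1 - beta)           # the signal catches up with A' located at x = beta*t
    t_return = t_reflect * (1 + beta)    # and travels the same distance back
    return t_reflect, t_return

K = k_from_beta(0.6)
t_reflect, t_return = radar_round_trip(T=1.0, beta=0.6)
print(K, t_reflect, t_return)
```

For β = 0.6 the coefficient K is 2, the return moment is $K^2T$, and the reflection moment is the half-sum of the sending and return moments, as assumed in the text.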


It is very convenient to make use of a graphical diagram in
order to present descriptively the results obtained. Let us
introduce the Cartesian coordinate system on the plane with coordinate
axes x and ct. Later we shall see that the choice of the spatial
and time coordinates of the same dimension is downright inevitable,
but for the present we shall be marking τ along the axis of
ordinates which is proportional to time: τ = ct. The x and τ axes
are drawn in Fig. 3.9. Every point of the plane represents the
event defined by the coordinates (x, τ). The motion of a body is
a sequence of events consisting in the arrival of this body at a
given point at a given moment of time; it is depicted as a curve
in the plane (x, τ).

The uniform motion of an object is depicted in this plane by a
straight line. The propagation of a light beam at the velocity c is
depicted by a bisecting line (the equation x = τ) running through
quadrants I and III when light propagates in the positive direction
of the x axis and through quadrants II and IV when light
travels in the opposite direction. Since the velocity of an object
is always less than that of light, the uniform motion of any object
is depicted by a straight line forming an angle less than π/4 with
the τ axis.

* Later we shall see that this equation determines the change of a light
frequency on reflection from a moving mirror (see § 7.5).

It is easy to find the points on the plane which depict the motion
of the observers A and A’. In the frame K shown in Fig. 3.9
the observer A is at rest; we shall suppose that he is located at the
point x = 0. Then his “world line”, i.e. the succession of
points in the plane (x, τ) corresponding to the events consisting
in his being at a given point at a given moment of time, will be
represented by the axis of ordinates. Hence, the axis of ordinates
is the world line of the observer A. The world line of the
observer A’ in the frame K is represented by the straight line
inclined to the τ axis at the angle α whose tangent is determined by
the ratio tan α = x/τ = x/ct = V/c. If at the moment t = 0 the
observers A and A’ were located at one point, the world line of
the observer A’ passes through the origin O. The sending and
reception of light signals by the observer A is depicted in the plot
(x, τ) as follows. The first “exchange” of signals takes place at the
point O. Then, after the time interval T (at the world point $A_1$),
the observer A sends a light signal. Its propagation is described by
the straight line $A_1A'_1$ parallel to the bisecting line. The
observer A’ will receive the light signal at the world point $A'_1$.
The propagation of the light signal sent by the observer A’ in the
reverse direction is depicted by the straight line $A'_1A_2$ parallel
to the bisecting line of quadrants II and IV not shown in Fig. 3.9.
The observer A will receive the return signal at the world point
$A_2$. According to the condition $OA_1 = T$ * and, from the
definition of the coefficient K, $OA_2 = K^2T$. In the frame K the
point $A'_1$ is associated with the moment of time (by the clock
located at the point x = 0, i.e. at the observer A) $A_3$. Obviously,
$OA_3 = \tfrac{1}{2}(OA_1 + OA_2) = \tfrac{1}{2}(K^2 + 1)T$. The
propagation time of the second signal from A to A’ is, naturally,
$\tfrac{1}{2}(OA_2 - OA_1) = \tfrac{1}{2}(K^2 - 1)T$.

* In the coordinates τ one should write $OA_1 = cT$, but for the sake of
simplicity we shall not do this.

Fig. 3.9. The graphic illustration of the determination of the
relative velocity of two observers.

Next, we shall note a useful theorem of the K calculus. It is 
seen from Eq. (3.49) that the change of the sign of the relative 
velocity, i.e. of the quantity β, transforms the quantity K into
1/K. This means that the receding and the approaching at the
same absolute value of the velocity correspond to reciprocal values
of the coefficient K.

Let us now consider the case when there are three reference
frames K, K’ and K” and three observers located at the corresponding
origins O, O’ and O”. Let the coefficient K be equal to
K(A, A’) for the observers A and A’; it depends only on the relative
velocity of the frames K and K’ which we shall designate
by V as before. If the relative velocity of the observers A’ and A”
is equal to W, the coefficient K for these observers, K(A’, A”),
depends only on W. Is it possible to find K(A, A”) when K(A, A’)
and K(A’, A”) are known? Let us derive the requisite equation.

Let the observer A send two light signals separated by the time 
interval T registered by his clock. The observer A’ receiving these 
signals will find that they come in separated by the time interval 
K(A, A’) T as it follows from the definition of the coefficient K. 
But this time is registered by the clock of the observer A’. The 
observer A” is located further from A than the observer A’, so 
that the signals passing the observer A’ go onward to A”. At the 
moment when A’ receives the first signal from A, he sends a light 
signal himself without delay to A” (do not worry: it is an 
imaginary experiment!). Now two signals propagate toward the 
observer A”: one travelling from A and another sent by the ob- 
server A’. Since both of them are light signals, they propagate at 
the same velocity, having left A’ at the same moment. In fact, 
they propagate as one signal. 

The same procedure is repeated by the observer A’ at the moment
when the second signal from A comes in. And again one
signal propagates from A’ to A”, consisting of two light pulses
sent from A and from A’.

The observer A” will receive the two signals. On the one hand,
according to the definition he will register by his clock that the
time interval between the signals is equal to K(A, A”) T. On the
other hand, these signals were sent by the observer A’ with the
time interval between them K(A, A’) T. According to the definition
the observer A” will find that the time interval between these
signals is equal to K(A’, A”)·K(A, A’) T. But the signals from A
and A’ arrive at A” simultaneously, so that

$$ K(A, A'') = K(A, A')\cdot K(A', A''). \tag{3.51} $$
The result is remarkably simple. Knowing the coefficients K
for two pairs of reference frames in which one common frame is
contained, one can obtain the unknown coefficient K pertaining to
the last pair of frames by multiplication of the known
coefficients K.

Graphically this result is readily obtained from Fig. 3.10. Here
the world lines of the three observers A, A’ and A” are depicted.
All the observers were at the same point O at the moment t = 0.
The time interval T later, the observer A sends from the world
point $A_1$ a light signal whose world line is depicted by a dotted
straight line $A_1A'_1A''_1$. According to the condition, $OA_1 = T$
and by the definition $OA'_1 = K(A, A')T$, $OA''_1 = K(A, A'')T$. On
the other hand, evidently $OA''_1 = K(A', A'')\cdot K(A, A')T$. Note
that the proposed diagram is suitable only for a graphical depiction
of “imaginary experiments” but cannot be used for a geometric
determination of various quantities. The plane (x, τ) is not just a
conventional Euclidean plane (see Chapter 4). However, combining a
graphical geometric description with algebraic determinations, we
shall not make a mistake.

Fig. 3.10. The derivation of Eq. (3.51).

It is easy to find the equation for the transformation of velocities
of coordinate systems. Suppose we want to find the relative
velocity U of the frames K and K” if the relative velocity of the
frames K and K’, designated by V, and that of the frames K’ and
K”, designated by W, are known.

Introducing the familiar designations $V/c = \beta_1$ and
$W/c = \beta_2$, we obtain from Eqs. (3.48), (3.49) and (3.51)

$$ \frac{U}{c} = \frac{K^2 - 1}{K^2 + 1} = \frac{K^2(A, A')\,K^2(A', A'') - 1}{K^2(A, A')\,K^2(A', A'') + 1} = \frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}. $$

Going back to conventional designations, we obtain the equation
for the velocity transformation (Eq. (3.28)):

$$ U = \frac{V + W}{1 + VW/c^2}. $$
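
The chain of Eqs. (3.49), (3.51) and (3.48) amounts to a three-line recipe: convert the velocities to coefficients K, multiply them, and convert back. A minimal sketch with hypothetical function names and c = 1 by default:

```python
import math

def k_from_beta(beta: float) -> float:
    """Eq. (3.49)."""
    return math.sqrt((1 + beta) / (1 - beta))

def compose_by_k(V: float, W: float, c: float = 1.0) -> float:
    """Relative velocity of K and K'' found by multiplying the coefficients K, Eq. (3.51)."""
    K_total = k_from_beta(V / c) * k_from_beta(W / c)
    return c * (K_total**2 - 1) / (K_total**2 + 1)   # Eq. (3.48)

def compose_directly(V: float, W: float, c: float = 1.0) -> float:
    """The velocity transformation, Eq. (3.28)."""
    return (V + W) / (1 + V * W / c**2)

print(compose_by_k(0.6, 0.7), compose_directly(0.6, 0.7))
print(k_from_beta(0.6) * k_from_beta(-0.6))   # receding and approaching: reciprocal values of K
```

The second line of output illustrates the theorem that a change of sign of β turns K into 1/K.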


It is seen from the reasoning quoted that the moments of
occurrence of events and the time intervals between them prove
to be different for observers from different IFRs. To detect this,
let us return to the experiment analysed earlier and involving
an exchange of light signals between the observers A and A’. We
shall recall that the first “exchange” is performed at the moment
when the observers are located at one point. At that very moment
the clocks of the observers A and A’ are set to the zero reading.
Then after the time interval T by his clock the observer A sends
a signal directed to A’; according to the definition, the time
interval separating the reception of the first and the second signals
by the observer A’ is equal to KT by his clock. However, the
observer A will ascribe the moment of time $\tfrac{1}{2}(K^2 + 1)T$
to the reception of the signal at A’ and will assume that the signals
sent by him at the intervals T will reach A’ with the intervals
$\tfrac{1}{2}(K^2 + 1)T$. As it was mentioned, the same interval in
terms of the clock A’ is equal to KT. Hence, the time interval
between the two identical events, the arrival of the first and the
second signal at A’, proves to be different: in terms of A’ it is
equal to KT and in terms of A it is equal to
$\tfrac{1}{2}(K^2 + 1)T$. Thus, we discovered that the time of the
event, i.e. the arrival of the second signal, is relative: it is
equal to KT in terms of A’ and $\tfrac{1}{2}(K^2 + 1)T$ in terms of
A. The time interval between the two events proved to be different
for A and A’ too. All this indicates that the time of an event as
well as the time interval between events are relative values.

Under what conditions will these values coincide? It happens
when $KT = \tfrac{1}{2}(K^2 + 1)T$. It can readily be inferred that
it is possible when K = 1 or, as it is seen from Eq. (3.48), when
V/c → 0. Thus, the difference in time readings and the relativity
of time intervals between events can be neglected in those IFRs
whose relative velocities are small compared to that of light.

The proper time. The K calculus makes it possible to determine 
readily a relationship of a time interval between two events that 
occur in a certain IFR at one point in space and are, consequently, 
registered by one clock (the proper-time interval), and a time 
interval between the same events registered by two clocks of 
another IFR in which the considered events occur at different 
points. 

Now let us go back to the exchange of light spots. If A sends
signals at the interval T by his clock, A’ receives them at the
interval KT by his clock. However, as we saw before (p. 108), this
interval is equal to $\tfrac{1}{2}(K^2 + 1)T$ in terms of A. It is
the ratio of these quantities that gives the relationship between the
proper-time interval Δτ = KT and the time interval Δt registered by
two clocks of another IFR. This ratio is equal to

$$ \frac{\Delta\tau}{\Delta t} = \frac{KT}{\tfrac{1}{2}(K^2 + 1)T} = \frac{2K}{K^2 + 1} = \sqrt{1 - \frac{V^2}{c^2}}, $$

where in the last link Eq. (3.48) is used. This result is of course
familiar to us.
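
A quick numerical check of the last link, assuming c = 1; the function names are hypothetical:

```python
import math

def k_from_beta(beta: float) -> float:
    """Eq. (3.49)."""
    return math.sqrt((1 + beta) / (1 - beta))

def proper_time_ratio(beta: float) -> float:
    """Ratio of the proper-time interval KT to the interval (K^2 + 1)T/2 ascribed by A."""
    K = k_from_beta(beta)
    return 2 * K / (K**2 + 1)

for beta in (0.001, 0.6, 0.99):
    print(beta, proper_time_ratio(beta), math.sqrt(1 - beta**2))
```

For small β the ratio is practically unity, which is the quantitative content of the remark that the relativity of time intervals can be neglected at low relative velocities.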

Relativity of rulers’ lengths (distances). Suppose we have two 
motionless points in the reference frame where the observer A’
is at rest. One may presume, although it is far from being obligatory,
that these points are a ruler’s ends. Let the ruler move
from the observer A and the observer A’ be located at the end of 
the ruler which is nearer to A. (Do not forget that the ruler is 
oriented along the direction of the relative velocity.) 

To determine the length of the ruler, the observer A sends a
signal at the moment $t_1$, registered by his clock, and waits for it
to return after reflection from the far end of the ruler. Let the
moment of the signal return be $t_4$ by the clock of A. Obviously,
the moment of the signal reflection is equal to
$\tfrac{1}{2}(t_1 + t_4)$. Exactly in the same manner, a signal can
be sent to the near end of the ruler (say, at the moment $t_2$) and
the moment of its return determined (for example, $t_3$). The moment
of the signal reflection from the near end is equal to
$\tfrac{1}{2}(t_2 + t_3)$. Both signals are reflected
simultaneously (by the clock of A) from both ends of the ruler,
provided the following condition is met:

$$ t_1 + t_4 = t_2 + t_3. \tag{3.52} $$


In an imaginary experiment this condition can be satisfied by 
choosing the times of sending of the first and the second signals. 

The first signal from A, however, will be received by the
observer A’, located at the near end of the ruler, at the moment
$Kt_1$ (recall that the initial readings of the clocks of A and A’
coincided when the observers were located at one point). The signal
reflected from the far end of the ruler and returning to A at the
moment $t_4$ will pass A’ at the moment $t_4/K$. Indeed, the signal
received by A’ at the moment $t_4/K$ will get to the observer A at the
moment $(t_4/K)\cdot K = t_4$. From the viewpoint of the observer A’
the doubled length of the ruler $l_0$ is determined as the time
interval, taken by light to reach the far end of the ruler and get
back, multiplied by the velocity of light, i.e.

$$ \left(\frac{t_4}{K} - Kt_1\right)c = 2l_0. \tag{3.53} $$

As to the relationship between $t_2$ and $t_3$, it follows directly
from the definition of the coefficient K:

$$ t_3 = K^2 t_2. \tag{3.54} $$
The imaginary experiments performed to measure length are 
illustrated in Fig. 3.11, which does not require any special expla- 
nations after the diagrams of Figs. 3.9 and 3.10 have been anal- 
ysed. 
The ruler’s length determined by the observer A is equal to the
difference of the distances from him to the far and near ends of
the ruler under the necessary condition that these distances are
determined simultaneously. This condition is satisfied owing to
the validity of Eq. (3.52). The distance from A to the far end
is equal to $\tfrac{1}{2}(t_4 - t_1)c$ and to the near end to
$\tfrac{1}{2}(t_3 - t_2)c$. Consequently, A has to assume the ruler’s
length $l$ equal to

$$ l = \tfrac{1}{2}\left[(t_4 - t_1) - (t_3 - t_2)\right]c. \tag{3.55} $$


Eqs. (3.53)-(3.55) make it possible to find the relationship
between $l$ and $l_0$. It follows from Eq. (3.52) that
$t_4 = t_2 + t_3 - t_1$. Substituting the expression obtained for
$t_4$ into the left-hand side of Eq. (3.53) and resorting to
Eq. (3.54), we get

$$ l_0 = \frac{c}{2}\left[\frac{t_2 + t_3 - t_1}{K} - Kt_1\right] = \frac{c}{2K}\left[t_2(K^2 + 1) - t_1(K^2 + 1)\right] = \frac{c\,(K^2 + 1)}{2K}\,(t_2 - t_1). \tag{3.56} $$

Since according to Eq. (3.52) $t_2 - t_1 = t_4 - t_3$, it follows
from Eq. (3.55) that

$$ t_2 - t_1 = \frac{(t_2 - t_1) + (t_4 - t_3)}{2} = \frac{(t_4 - t_1) - (t_3 - t_2)}{2} = \frac{l}{c}. $$

Now Eq. (3.56) takes the form

$$ l_0 = \frac{K^2 + 1}{2K}\,l = \frac{l}{\sqrt{1 - \beta^2}}, $$

where in the last equation the formula (3.50) is taken into account.
This is exactly what we obtained earlier as Eq. (3.5).

This derivation shows quite distinctly how essential it is to find 
the ruler’s ends simultaneously when its length is determined. 
Incidentally, note that the derivation of Eq. (2.4) involves, in 
essence, a radar approach as well. 
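
The whole measurement can be replayed numerically. In the sketch below all names are hypothetical and c = 1. The world lines of the ruler's ends are written in the frame K: the near end at x = βt and the far end at x = βt + l. The four moments $t_1, \ldots, t_4$ are then found by following the light signals, and the lengths are evaluated from Eqs. (3.55) and (3.53).

```python
import math

def measure_ruler(beta: float, l: float, t1: float):
    """Radar measurement of a receding ruler, followed in frame K with c = 1."""
    K = math.sqrt((1 + beta) / (1 - beta))
    # The signal sent at t1 meets the far end (x = beta*t + l) at the moment t_far.
    t_far = (t1 + l) / (1 - beta)
    t4 = t_far + (beta * t_far + l)        # it returns to A after covering the same distance
    # Choose t2 so that the reflection from the near end (x = beta*t) occurs at t_far too,
    # i.e. so that the simultaneity condition (3.52) holds.
    t2 = t_far * (1 - beta)
    t3 = K**2 * t2                         # Eq. (3.54)
    length_in_K = 0.5 * ((t4 - t1) - (t3 - t2))    # Eq. (3.55)
    rest_length = 0.5 * (t4 / K - K * t1)          # Eq. (3.53)
    return (t1, t2, t3, t4), length_in_K, rest_length

times, length_in_K, rest_length = measure_ruler(beta=0.6, l=1.0, t1=1.0)
print(times, length_in_K, rest_length)
```

The moments satisfy $t_1 + t_4 = t_2 + t_3$, the length found by A reproduces the value l put into the world lines, and the length in the ruler's own frame comes out larger by the factor $1/\sqrt{1 - \beta^2}$.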

The Lorentz transformation. We have made sure that the K calculus
can be employed to derive all the basic consequences of
the Einstein postulates. The advantage of this derivation lies in
the fact that there is no need for an explicit introduction of a
coordinate system.

But, of course, the application of STR methods in physics re- 
quires an explicit introduction of a reference frame. If so, the in- 
troduction of the Lorentz transformation is outright inevitable. 
The Lorentz transformation can be derived by means of the K 
calculus. 

Consider the two reference frames K and K’ with the respective
observers A and A’ registering the same event. In both frames the
initial time reading is chosen so that $t = t' = 0$ when both
origins coincide. Then at the moment $t_1$ the observer A sends a
light signal to A’, which is received by him at the moment $t'_1$
by his clock; the signal sent by A proceeds further accompanied by
the signal sent by A’ at the moment when he receives the signal
from A. In fact, one signal consisting of two propagates along
the x axis. Let the event P represent the arrival of that signal
at some point (or the arrival of the signal coincides with the
moment of occurrence of a certain event). At that point the signal
is reflected (or, otherwise, the return signal is sent immediately
on the arrival of the direct one). First, it gets to the observer
A’ at the moment $t'_2$; at the same moment A’ sends his signal in
the direction of A. Now the single signal consisting, in fact, of
two signals propagates from A’ to A. It is received by the
observer A at the moment $t_2$ (Fig. 3.12).

Fig. 3.11. The determination of the length of a moving ruler.

Fig. 3.12. The derivation of the Lorentz transformation.

The observer A will ascribe the coordinates to the event P as
follows. The time $t$ of the event is just the half-sum of the
times of sending and reception of the signal since the velocity of
light on the way “there” and “back” is equal:

$$t = \tfrac{1}{2}(t_1 + t_2). \qquad (3.57)$$


The distance to the point where the event occurred can be found
if the propagation velocity of the signal c is multiplied by the
time which the signal takes to travel “there”; this time is equal
to half the total time spent by the signal. Since the signal
travelled a closed path during the time $t_2 - t_1$, the event
coordinate $x$ will be determined by the observer A as

$$x = \tfrac{1}{2}(t_2 - t_1)\,c. \qquad (3.58)$$

From Eqs. (3.57) and (3.58) we obtain

$$t_1 = t - \frac{x}{c}, \qquad t_2 = t + \frac{x}{c}. \qquad (3.59)$$


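But the observer A’ will find in exactly the same manner that

$$t'_1 = t' - \frac{x'}{c}, \qquad t'_2 = t' + \frac{x'}{c}. \qquad (3.60)$$

According to the definition of the coefficient $K$, and comparing
the intervals between the exchanges of signals, we get

$$t'_1 - 0 = K(t_1 - 0), \qquad t_2 - 0 = K(t'_2 - 0). \qquad (3.61)$$

According to Eqs. (3.59) and (3.60) we obtain

$$t' - \frac{x'}{c} = K\left(t - \frac{x}{c}\right), \qquad (3.62)$$

$$t + \frac{x}{c} = K\left(t' + \frac{x'}{c}\right). \qquad (3.63)$$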


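Multiplying crosswise Eqs. (3.62) and (3.63), we immediately
obtain that the quantity

$$t^2 - \frac{x^2}{c^2} = t'^2 - \frac{x'^2}{c^2} \qquad (3.64)$$

retains its value in all IFRs, i.e. is the invariant. Having written
Eqs. (3.62) and (3.63) in the form more convenient for solution

$$t' - \frac{x'}{c} = K\left(t - \frac{x}{c}\right), \qquad (3.65)$$

$$t' + \frac{x'}{c} = \frac{1}{K}\left(t + \frac{x}{c}\right), \qquad (3.66)$$

we readily find that

$$t' = \frac{K^2 + 1}{2K}\,t - \frac{K^2 - 1}{2K}\,\frac{x}{c}, \qquad x' = \frac{K^2 + 1}{2K}\,x - \frac{K^2 - 1}{2K}\,ct.$$

Taking into account Eq. (3.50), we discover that this is just the
Lorentz transformation:

$$x' = \Gamma\,(x - Vt), \qquad t' = \Gamma\left(t - \frac{V}{c^2}\,x\right).$$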
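
The chain of Eqs. (3.57)-(3.61) can be followed numerically. The
sketch below is an illustration rather than part of the
derivation: it assumes that Eq. (3.50) is the usual relation
K = sqrt((1 + β)/(1 - β)), and all function names are
hypothetical.

```python
import math

def k_factor(beta):
    """Assumed form of Eq. (3.50): K = sqrt((1 + beta)/(1 - beta))."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def radar_transform(t, x, beta, c=1.0):
    """Coordinates (t', x') of an event obtained from signal times only."""
    K = k_factor(beta)
    t1 = t - x / c              # Eq. (3.59): moment of emission by A
    t2 = t + x / c              # Eq. (3.59): moment of reception by A
    t1p = K * t1                # Eq. (3.61): signal passes A' on the way there
    t2p = t2 / K                # Eq. (3.61): signal passes A' on the way back
    tp = 0.5 * (t1p + t2p)      # analogue of Eq. (3.57) for A'
    xp = 0.5 * (t2p - t1p) * c  # analogue of Eq. (3.58) for A'
    return tp, xp

def lorentz_transform(t, x, beta, c=1.0):
    """Coordinates (t', x') from the Lorentz transformation with V = beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    V = beta * c
    return gamma * (t - V * x / c ** 2), gamma * (x - V * t)
```

Both functions should return the same pair for any event and any
β between -1 and 1; for example, the event t = 5, x = 3 (with
c = 1) seen from a frame with β = 0.6 has t' = 4, x' = 0.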


CHAPTER 4 


THE FOUR-DIMENSIONAL 
SPACE-TIME 


§ 4.1. Three-dimensional and four-dimensional Euclidean spaces. 
When we introduce a coordinate system, the position of every 
point is specified by three numbers which are referred to as the 
coordinates of a point. A manifold of three dimensions is under- 
stood as a set of all points. If we want to pass over from a mani- 
fold to space possessing definite geometrical properties, we have 
to define the expression for a distance between two infinitely close
points of the manifold. Having assigned the square of the distance
between such points, one can define basic geometric quantities,
such as a vector’s length, an angle between vectors, areas of
two-dimensional figures formed by vectors. Geometry, whose principal
laws were formulated by Euclid, is valid to a high degree of
accuracy in the world that we live in. In accordance with Euclidean
geometry the square of the distance between two infinitely close
points can be put down in the Cartesian coordinates in the
following form:

$$ds^2 = dx^2 + dy^2 + dz^2. \qquad (4.1)$$

This equation represents nothing other than the Pythagorean
theorem written out for the diagonal of a rectangular
three-dimensional parallelepiped with the sides dx, dy, dz.

A coordinate system can be selected at will (the Cartesian coor- 
dinate system is distinguished only for its simplicity), and a 
distance between points, owing to its geometric meaning, should 
not depend on the choice of a system. This means that Eq. (4.1) 
has to be an invariant of any transformation of coordinates. 
A distance between any two points has also to be an invariant
of the transformation of coordinates. Thus, in Euclidean geometry
the invariant is the distance between two points:

$$r_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}, \qquad (4.2)$$

where $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are the coordinates of
two points in space. Eqs. (4.1) and (4.2) relate the coordinates of
two points of space. In particular, the transformation equations for
transitions from one Cartesian system to another are given in
Appendix 1, § 2. Such a transition represents a rotation, provided we
ignore the system’s translation which is of little interest to us.
It is seen from these equations that in a new coordinate system any
new coordinate is expressed through all old ones.
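
The invariance of the distance can be illustrated with the
simplest rotation. The sketch below assumes a rotation of the
axes about the z axis through a hypothetical angle phi; the
general formulae of Appendix 1, § 2 are not reproduced here, and
the function names are chosen only for this example.

```python
import math

def rotate_about_z(point, phi):
    """Coordinates of `point` in axes rotated through phi about the z axis."""
    x, y, z = point
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi),
            z)

def distance(p1, p2):
    """Eq. (4.2): distance between two points in Cartesian coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))
```

Each new coordinate x', y' is a combination of both old ones,
while `distance` gives the same value before and after
`rotate_about_z` is applied to both points.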

In a three-dimensional Euclidean space one can introduce vec- 
tors specified by a triad of numbers, i.e. vector components. The 
coordinates of a point comprise the components of a radius vec- 
tor. Consequently, components of any vector are transformed ac- 
cording to the coordinate transformation rule. Norms of vectors, 
their dot products and an angle between them are found via 
vector components according to the known rules. 

What would the appearance of one more dimension in a Euclid- 
ean space imply? Certainly, it is difficult to visualize a four- 
dimensional space with one’s own eyes. But there is no such need. 
Having available the principal relationships for a three-dimen- 
sional space, we just carry them over to a four-dimensional space. 
Let the coordinates of a point in the four-dimensional space be
$x, y, z, w$. For a four-dimensional Euclidean space the square of
the distance between two infinitely close points will be written
in the following form (the symmetrical designations are also
given):

$$ds^2 = dx^2 + dy^2 + dz^2 + dw^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2, \qquad (4.3)$$

and the distance between points

$$\rho_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 + (w_2 - w_1)^2}. \qquad (4.4)$$


Eqs. (4.3) and (4.4) will be the invariants of the coordinate 
transformation, and basic geometrical relationships will be found 
in much the same way as they are found in three-dimensional 
space. 

§ 4.2. The 4-space-time, or the four-dimensional pseudo-Euclid- 
ean space. Let us consider a four-dimensional manifold made up 
of “points” whose coordinates are constituted by four numbers 
$x, y, z, \tau = ct$ defining a four-dimensional point. One or another
event representing an instantaneous physical process can occur 
at any point of this manifold. The four-dimensional space-time is 
a purely geometric notion. Sometimes, following Minkowski, this 
space is called the “world”. Any event occurs at some point of 
the Minkowski world. 

Geometric properties of the Minkowski world can be established 
after some invariant relationship between coordinates of points 
is found, which can be interpreted as the distance between two 
points of a manifold. When the distances between points are 
defined, we may pass from manifold to space. But how can the 
necessary invariant relation be found? It should not be forgotten
that the coordinates of the “world” points are defined with
physically different quantities, so that it is impossible to presume
in advance that the “distance” in this world can be defined by an
expression of the type of Eq. (4.3). But the theory of relativity
answers this question unambiguously. If only inertial frames
of reference are considered, the interval between events
(Eq. (3.19)) remains the invariant for any pair of events, or, in
terms of geometry, for any pair of points in the Minkowski world.
The transition from one IFR to another is described by the Lorentz
transformation, and no other transformation is needed in the
framework of the STR. Consequently, from physical considerations we
can take the expression for the square of the interval between events

$$ds^2 = d\tau^2 - dx^2 - dy^2 - dz^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 \qquad (4.5)$$

as the basic invariant quadratic form defining the “distance” in
the Minkowski world. Here $\tau = ct$. It is Eq. (4.5) that defines the
square of the distance between two infinitely close points in the 
Minkowski world. Thus, the Einstein postulates, from which the 
invariance of the interval between events follows, signify that 
geometry of the four-dimensional space-time, i.e. the Minkowski 
space, is determined by the basic fundamental form of the (4.5) 
type. It is seen from the appearance of this form that coordinates 
and time are not equivalent. 

It will be shown in Supplement V that the transition from
inertial frames of reference to non-inertial ones alters the
appearance of the interval between events. Although this expression
always remains invariant, its form becomes different, so that the
square of the interval takes the following form:

$$ds^2 = g_{ik}\,dx^i\,dx^k, \qquad (4.6)$$

where summation over the repeated indices $i$ and $k$ from 1 to 4
is implied and the coefficients $g_{ik}$, referred to as metric
coefficients, may depend on coordinates and time. We need this
general equation now only in order to write out $g_{ik}$ for
Eqs. (4.3) and (4.5). Using the symmetric designations of
Eqs. (4.3) and (4.5), we obtain respectively

$$g_{11} = g_{22} = g_{33} = g_{44} = 1, \qquad (4.3')$$

$$g_{00} = 1, \quad g_{11} = g_{22} = g_{33} = -1. \qquad (4.5')$$


It is evident that Eqs. (4.3) and (4.5) differ by signs of metric
coefficients. A totality of these signs is called a signature of
corresponding quadratic forms. The signature of Eq. (4.3) has the
form (+ + + +), while the signature of Eq. (4.5) is (+ − − −).
If a four-dimensional space were formed by a simple increase of
the number of dimensions of our conventional space, the signature
would be (+ + + +). Such a space would not differ from our
space in anything except the number of dimensions and it would
be referred to as the Euclidean (four-dimensional) space. The
signature of space considered in the special theory of relativity
corresponds to that of Eq. (4.5), i.e. (+ − − −).

A change of a signature implies a variation of the “distance” 
between points in space, the variation of properties of this space 
as compared to those of the customary Euclidean space. This 
four-dimensional space, possessing unusual geometric properties, 
is extremely important for the STR. It is in this space that all
physical phenomena take place. 

The geometry of the Minkowski world differs from Euclidean 
geometry, but not too much, since the coefficients in Eq. (4.5) as 
well as those in Eq. (4.3) are constant. Accordingly, the geometry 
defined by the quadratic form of Eq. (4.5) is customarily called 
pseudo-Euclidean and the corresponding space a pseudo-Euclidean 
space. 

Thus, the space of the four variables x, y, z, ct of the special
theory of relativity is the four-dimensional pseudo-Euclidean space.
It arises not merely by adding the fourth (time) coordinate
ct to the three spatial ones x, y, z, but through the peculiar
definition (Eq. (4.5)) of the invariant distance between the points
of this space.
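
The role of the signature can be seen in a small numerical
experiment. The sketch below is illustrative only: it stores the
diagonal metric coefficients of Eqs. (4.3') and (4.5') as tuples,
orders the coordinate differences as (τ, x, y, z), and assumes the
standard Lorentz transformation for motion along the x axis; the
names are hypothetical.

```python
import math

EUCLIDEAN = (1.0, 1.0, 1.0, 1.0)     # signature (+ + + +), Eq. (4.3')
MINKOWSKI = (1.0, -1.0, -1.0, -1.0)  # signature (+ - - -), Eq. (4.5')

def quadratic_form(g_diag, dx):
    """ds^2 = g_ik dx^i dx^k for a diagonal metric, Eq. (4.6)."""
    return sum(g * d * d for g, d in zip(g_diag, dx))

def boost(dx, beta):
    """Lorentz transformation of (d_tau, dx, dy, dz) for motion along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    d0, d1, d2, d3 = dx
    return (gamma * (d0 - beta * d1), gamma * (d1 - beta * d0), d2, d3)
```

For the differences (5, 3, 1, 2) the form with the signature
(+ − − −) keeps its value under `boost`, whereas the form with the
signature (+ + + +) does not, which is why only the former can
serve as the “distance” in the Minkowski world.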

A physical motive for the consideration of the pseudo-Euclidean
space lies in the fact that the coordinate and time readings
pertaining to an event are not equivalent in the STR in spite of
their close connection.

§ 4.3. 4-vectors and 4-tensors. Exactly as in a three-dimensional 
space, coordinates of a point in a four-dimensional space can be
treated as components of a four-dimensional radius vector drawn
from the origin of a coordinate system to a given point. All
four-dimensional vectors will be designated by an arrow over a
letter; in particular, a four-dimensional radius vector will be
designated by $\vec{R}$. For the convenience of our readers we shall
be presenting basic relationships both in complex notation and via
the real variables. Complex notation simplifies the presentation of
electrodynamics, while the usage of real variables leads us to the
formalism of the general theory of relativity, where the
introduction of a complex coordinate is of no use. Most of the
equations will be written in a symmetric notation, and the
presentation of coordinates of a four-dimensional vector in a
two-line form will make it possible to recall the meaning of the
introduced designations. Thus, we introduce a four-dimensional
radius vector in one of the following ways:

$$\vec{R}\begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x & y & z & i\tau = ict \end{pmatrix}; \ \text{(a)} \qquad \vec{R}\begin{pmatrix} x^0 & x^1 & x^2 & x^3 \\ \tau = ct & x & y & z \end{pmatrix}. \ \text{(b)} \qquad (4.7)$$

The usage of superscripts in Eq. (4.7b) is not accidental. When
real values of coordinates are used, the difference between
covariant and contravariant vector components needs to be
emphasized (see Appendix 1, § 8), and, consequently, superscripts
are used with contravariant components. So the square of the
interval between events will be written in the form

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2 = \sum_i dx_i^2 = g_{ik}\,dx_i\,dx_k; \qquad \text{(a)}$$

$$ds^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 = g_{ik}\,dx^i\,dx^k. \qquad \text{(b)} \qquad (4.8)$$

Differing from zero in (a): $g_{11} = 1$, $g_{22} = 1$, $g_{33} = 1$,
$g_{44} = 1$. Differing from zero in (b): $g_{00} = 1$, $g_{11} = -1$,
$g_{22} = -1$, $g_{33} = -1$.

Note that the intervals given in Eqs. (4.8a) and (4.8b) have 
opposite signs. Since $ds^2$ may be either negative or positive, the
choice of signs for ds? is of no practical importance. 

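The Lorentz transformation is a transformation of four-dimensional
radius vector components, that is coordinates of an event.
We shall write them out again:

$$x'_1 = \Gamma(x_1 + i\beta x_4), \quad x'_2 = x_2, \quad x'_3 = x_3, \quad x'_4 = \Gamma(x_4 - i\beta x_1); \qquad \text{(a)}$$

$$x'^0 = \Gamma(x^0 - \beta x^1), \quad x'^1 = \Gamma(x^1 - \beta x^0), \quad x'^2 = x^2, \quad x'^3 = x^3. \qquad \text{(b)} \qquad (4.9)$$

A four-dimensional radius vector is one of four-dimensional
vectors, so that if in the reference frame K the following
four-dimensional vectors are specified

$$\vec{A}\,(A_1, A_2, A_3, A_4) \qquad \text{or} \qquad \vec{A}\,(A^0, A^1, A^2, A^3),$$

the components of the same vectors will be determined as follows
in the frame K’:

$$A'_1 = \Gamma(A_1 + i\beta A_4), \quad A'_2 = A_2, \quad A'_3 = A_3, \quad A'_4 = \Gamma(A_4 - i\beta A_1); \qquad \text{(a)}$$

$$A'^0 = \Gamma(A^0 - \beta A^1), \quad A'^1 = \Gamma(A^1 - \beta A^0), \quad A'^2 = A^2, \quad A'^3 = A^3. \qquad \text{(b)} \qquad (4.10)$$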

Eqs. (4.8a, b) represent the square of an infinitesimal vector
$(d\vec{R})^2$. Consequently, the square of the norm of a
four-dimensional vector (which is an invariant quantity) must be
determined in this way:

$$\vec{A}^2 = g_{ik} A_i A_k = A_1^2 + A_2^2 + A_3^2 + A_4^2; \qquad \text{(a)}$$

$$\vec{A}^2 = g_{ik} A^i A^k = A_k A^k = (A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2. \qquad \text{(b)} \qquad (4.11)$$


Of course, Eqs. (4.11a) and (4.11b) give opposite signs for the
invariant quantity $\vec{A}^2$. But this is of no significance just
as in the case when the sign of the interval is determined (see the
note after Eq. (4.8)). It should be borne in mind, though, that
different signs of the interval alter the conditions defining
“time-like” and “space-like” intervals and vectors. (There is no
agreement in the literature concerning this issue.)

In Eq. (4.11b) we introduced covariant coordinates according
to the formulae of Appendix 1, § 8: $A_k = g_{ik} A^i$. It is easy to
notice that $A_0 = A^0$, $A_1 = -A^1$, $A_2 = -A^2$, and $A_3 = -A^3$.
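
The two notations can be compared on a concrete vector. The sketch
below is only an illustration with hypothetical function names: it
applies Eqs. (4.10a) and (4.10b) to the same 4-vector, written
once with the imaginary fourth component and once with real
contravariant components, and evaluates the norms of Eq. (4.11).

```python
import math

def boost_real(A, beta):
    """Eq. (4.10b) for A = (A^0, A^1, A^2, A^3)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    A0, A1, A2, A3 = A
    return (g * (A0 - beta * A1), g * (A1 - beta * A0), A2, A3)

def boost_complex(A, beta):
    """Eq. (4.10a) for A = (A_1, A_2, A_3, A_4) with imaginary A_4."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 + 1j * beta * A4), A2, A3, g * (A4 - 1j * beta * A1))

def lower_index(A):
    """Covariant components A_k = g_ik A^i for the metric of Eq. (4.5')."""
    return (A[0], -A[1], -A[2], -A[3])

def norm_real(A):
    """Eq. (4.11b): A_k A^k."""
    return sum(c * v for c, v in zip(lower_index(A), A))

def norm_complex(A):
    """Eq. (4.11a): sum of the squares of the four components."""
    return sum(a * a for a in A)
```

For the real components (5, 3, 1, 2) and the corresponding complex
components (3, 1, 2, 5i) the two norms are 11 and -11, and both
remain unchanged after the transformation with any β.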

Just as in the case of a three-dimensional space, we shall have
to deal with tensors. The tensor component transformation law is
most easily derived from the transformation law for a product of
two four-dimensional vector components. The transformation
equations for the components of the four-dimensional vectors
$\vec{A}$ and $\vec{B}$ can be written down using the symmetric
notation (see Eqs. (2.40a, b)):

$$A'_i = \alpha_{il} A_l, \quad B'_k = \alpha_{km} B_m; \ \text{(a)} \qquad A'^i = \alpha^i_l A^l, \quad B'^k = \alpha^k_m B^m. \ \text{(b)} \qquad (4.12)$$

Multiplying the left-hand and right-hand sides of these equations,
we obtain at once the transformation rules for vector component
products:

$$A'_i B'_k = \alpha_{il}\alpha_{km} A_l B_m; \ \text{(a)} \qquad A'^i B'^k = \alpha^i_l \alpha^k_m A^l B^m. \ \text{(b)} \qquad (4.13)$$

Thus we obtain the general transformation law for the tensors
$T_{ik} = A_i B_k$ and $T^{ik} = A^i B^k$:

$$T'_{ik} = \alpha_{il}\alpha_{km} T_{lm}, \qquad (4.14)$$

$$T'^{ik} = \alpha^i_l \alpha^k_m T^{lm}. \qquad (4.15)$$

Eq. (4.15), in which the difference between the covariant and
contravariant coordinates is essential, represents the
transformation law for a twice-contravariant tensor.
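
The transformation law of a twice-contravariant tensor can be
verified on a tensor built from two vectors. The sketch below is
illustrative: it assumes that the transformation coefficients are
those of the Lorentz transformation along the x axis, with the
components ordered as (0, 1, 2, 3), and uses hypothetical function
names.

```python
import math

def boost_matrix(beta):
    """Coefficients of the Lorentz transformation along x (real notation)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_vector(alpha, A):
    """Contravariant vector law: A'^i = alpha^i_l A^l."""
    return [sum(alpha[i][l] * A[l] for l in range(4)) for i in range(4)]

def outer(A, B):
    """Tensor T^{ik} = A^i B^k formed from two vectors."""
    return [[A[i] * B[k] for k in range(4)] for i in range(4)]

def transform_tensor(alpha, T):
    """Tensor law of Eq. (4.15): T'^{ik} = alpha^i_l alpha^k_m T^{lm}."""
    return [[sum(alpha[i][l] * alpha[k][m] * T[l][m]
                 for l in range(4) for m in range(4))
             for k in range(4)]
            for i in range(4)]
```

Transforming the tensor with `transform_tensor` gives the same
components as forming the product of the two transformed vectors,
which is the content of Eq. (4.13).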

In the 4-space the measured physical quantities should be so 
arranged as to possess quite definite transformation properties 
with respect to a transition from one IFR to another, i.e. to the 
Lorentz transformation. But in the coordinate transformation
(including the fourth coordinate of the Minkowski world) only tensor
quantities possess the definite transformation properties, and 
tensors of different rank transform according to different rules. 
Hence, all physical quantities to which we ascribe a real meaning 
have to be tensors: either scalars, i.e. zero-rank tensors, or 4-vec- 
tors, i.e. first-rank tensors, or, finally, tensors of a higher-than-one 
rank. We shall see later that an electromagnetic field forms a 
second-rank tensor (see Chapter 6). The transition from customary 
three-dimensional quantities to four-dimensional ones (which is, 
no doubt, necessary in the case of the Lorentz transformation) is 
not always straightforward and is realized differently in different 
cases. It is often possible to represent, with some modification, a 
customary three-dimensional vector as a spatial part of a 4-vector. 
As to the fourth component, its expression seems to be rather sur- 
prising at first, but in the final analysis proves to be natural. 
There is nothing amazing in this since in a non-relativistic limit 
we nearly always come back from relativistic relationships to
classical ones. 

Chapters 5-7 provide numerous examples of constructing four- 
dimensional vectors and tensors. 

§ 4.4. A pseudo-Euclidean plane. Characteristic features of the
pseudo-Euclidean space can be illustrated by means of the
pseudo-Euclidean plane. One of the two coordinate axes must
necessarily represent the time axis, or the axis of
time-proportional quantities, since in the STR purely spatial
geometry remains Euclidean, and only space-time is described by
pseudo-Euclidean geometry. In our choice of reference frames it is
most convenient to consider the plane $(x, \tau)$.

Fig. 4.1. (a) The world line of an object at rest. (b) The world
line of an object moving uniformly along the x axis.

Recall that the four-dimensional space-time continuum whose points
represent events is sometimes called the Minkowski world. Every
event in our real physical world occurs at a definite world point of
the Minkowski world. Considering a particle, one can regard its
staying at a given point at a given moment of time as an event. No
matter whether this particle moves or not, the sequence of events
happening with the particle in the Minkowski world yields a certain
curve called the world line of the particle.
Let us draw the $x$, $\tau$ axes of the frame K at right angles to
each other and analyse the simplest cases. Let a particle be
located at the point $x = x_0$ in the frame K; its world line in the
plane $(x, \tau)$ of the Minkowski world will be a straight line
parallel to the $\tau$ axis (Fig. 4.1a). Let another particle move
uniformly along the x axis in the frame K at the velocity $v$. Its
world line in this frame will be a straight line inclined at the
angle $\theta$ to the $\tau$ axis (Fig. 4.1b). A bit later we shall
see that $\theta = \arctan (v/c)$.

Now we shall examine an arbitrary motion of a particle in this
reference frame. The motion of this particle is represented by the
world line $x = x(\tau)$ in the plane $(x, \tau)$, as it is depicted
in Fig. 4.2.

Fig. 4.2. The system of real coordinates $x$, $\tau = ct$. The
particle's position at a given moment is specified by the point in
this plane. The particle's motion is depicted in this plane by the
so-called world line of a point. The world lines of motionless
points are straight lines parallel to the $\tau$ axis. The world
line of light rays is the coordinate angle bisector. In the case of
the variable velocity the angle formed by the tangent line to the
world line and the $\tau$ axis is defined from the relation
$\theta = \arctan (v/c)$, where $v$ is the instantaneous velocity of
a particle.

The inclination of the world line to the $\tau$ axis at each given
point is determined by the derivative $dx/d\tau$ at that point.
Indeed (see Fig. 4.2),

$$\tan\theta = \frac{dx}{d\tau} = \frac{1}{c}\,\frac{dx}{dt} = \frac{v}{c}. \qquad (4.16)$$

Thus, the inclination angle is determined from the following
equation:

$$\theta = \arctan\frac{v}{c} = \arctan\beta, \qquad (4.17)$$

where $\beta = v/c$ and $v$ is the instantaneous velocity of the
point or the object. Inasmuch as $\beta < 1$ always, the angle
$\theta$ cannot exceed 45° for any moving object. The world line of
light rays will be represented by the bisecting line of the
coordinate angle.
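
A few values of the inclination angle may help to fix the picture.
The short sketch below simply evaluates Eq. (4.17); the function
name and the chosen velocities are hypothetical, and the numerical
value of c is the standard one.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def inclination_deg(v, c=C):
    """Angle between the world line and the tau axis, Eq. (4.17), in degrees."""
    beta = v / c
    if abs(beta) > 1.0:
        raise ValueError("the velocity must not exceed the velocity of light")
    return math.degrees(math.atan(beta))
```

A particle at rest gives 0°, a particle with v = c/2 gives about
26.6°, and the limiting case v = c gives the 45° of the world line
of light rays.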

We saw in § 2.9 that the $\tau'$, $x'$ axes are obtained from the
$\tau$, $x$ axes as a result of the Lorentz transformation, provided
these axes are drawn together in a scissors-like manner to the world
line of light rays. The relativity of simultaneity is graphically
seen in Fig. 4.3a where the $\tau'$, $x'$ axes are drawn together
with the $\tau$, $x$ axes. In the frame K’ all events lying on the
$x'$ axis, or on the straight lines $\tau' = \text{const}$, are
simultaneous. In terms of geometry, all these lines parallel to the
$x'$ axis represent the simultaneity lines in the frame K’.

Fig. 4.3. (a) The Lorentz transformation reduces to the rotation of
the $x$ and $\tau$ axes through the angle $\arctan\beta$ about the
origin of coordinates toward the coordinate angle bisector to their
new positions $x'$, $\tau'$. The straight lines $x' = \text{const}$
are now parallel to the $O\tau'$ axis, while the straight lines
$\tau' = \text{const}$ are parallel to the $Ox'$ axis. (We have
passed over to the rectilinear oblique-angled system of
coordinates.) The relativity of simultaneity is clearly seen: the
events $A_1$ and $A_2$ which are simultaneous in the frame K’ (lying
on the straight line $\tau' = \text{const}$) are not simultaneous in
the frame K. To find the respective moments in the frame K, we
project them on the $\tau$ axis by means of straight lines parallel
to the x axis. (b) Here are two world lines of objects (LL and MM).
The relativity of the distance between moving objects is seen very
well. To find the distance between them, one has to determine the
coordinates of these objects simultaneously. Let one of the objects
be located at the point N. Then in terms of the frame K the second
object is at the point R at the same moment. But in terms of the
frame K’ the second object is at the point P at the same moment.
The sections NR and NP corresponding to the distances between the
objects have different lengths.

Let us consider the two events $A_1$ and $A_2$ lying on the $x'$
axis, both these events occurring simultaneously in the frame K’ at
the moment $\tau' = 0$. To find the moments of time at which these
two events occur in the frame K, one should “project” these events
on the $\tau$ axis by drawing straight lines parallel to the x axis,
since in the frame K’ the events lying on the straight lines
$\tau' = \text{const}$ (Fig. 4.3a) are simultaneous. We see that in
the frame K these events occur at different moments of time $t_1$
and $t_2$. Of course, this is only a geometric illustration of the
relativity of clock synchronization that we dealt with in § 2.4.
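
The same conclusion can be reached by direct calculation. The
sketch below is an illustration with hypothetical names and sample
events: it assumes the inverse Lorentz transformation in the
variables τ = ct, x and converts two events that are simultaneous
in K’ into the coordinates of K.

```python
import math

def to_frame_K(tau_p, x_p, beta):
    """Inverse Lorentz transformation: (tau', x') in K' -> (tau, x) in K."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (tau_p + beta * x_p), g * (x_p + beta * tau_p)

# Two events on the x' axis, i.e. simultaneous in K' at tau' = 0.
beta = 0.6
event_A1 = to_frame_K(0.0, 1.0, beta)
event_A2 = to_frame_K(0.0, 2.0, beta)
```

With β = 0.6 the events occur in the frame K at τ = 0.75 and
τ = 1.5 respectively, and the ratio τ/x equals β for both, as it
should for points lying on the x’ axis.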

A very important result follows from Fig. 4.3b. It shows the
world lines of two objects moving uniformly but at different
velocities. To determine the distance between them at a given moment
of time, the coordinates of these objects should be found
simultaneously in the frame in which this distance is being
determined. It is clearly seen that the distance between objects
measured in the frames K and K’ proves to be different. Due to the
equivalence of reference frames none of the distances obtained can
be regarded as the true one. But then all laws of mechanics, in
which force depends on distance, become ambiguous in the case of
moving objects. Naturally, this problem did not emerge in Newtonian
mechanics where time was regarded as absolute.

Let us consider the $x$, $\tau$ axes of the frame K (Fig. 4.4). The
square of the interval between two world points is defined by the
expression $s_{12}^2 = (\tau_2 - \tau_1)^2 - (x_2 - x_1)^2$. For the
sake of simplicity let us suppose that event 1 occurred at the point
$x = 0$ at the moment $\tau = 0$, i.e. at the point O. Any events
that occurred on the x axis before and after event 1 are depicted by
points in the plane $(x, \tau)$. Since the square of the interval,
that is the distance, from event 1 to any other event is equal to
$s^2 = \tau^2 - x^2$, this plane is subdivided into four quadrants
I, II, III, IV by the straight lines $x = \pm\tau$, which correspond
to the sequence of events consisting in the emission of a signal
from the point $x = 0$ at the moment $\tau = 0$ and its arrival at
the point $x$ at the moment $\tau$. The interval between the events
located on the straight lines $\tau^2 - x^2 = 0$ is light-like, and
the “distance” between such events is equal to zero in the
pseudo-Euclidean plane. Now let us consider the four quadrants
exterior to the light-like straight lines. In quadrant I
$s^2 = \tau^2 - x^2 > 0$. Consequently, the interval between any
event of quadrant I and event 1 is time-like. For all events of this
quadrant $\tau > 0$; consequently, all of them will occur after
event 1, and no choice of a reference frame can alter this
situation. This means that quadrant I is the region of absolute
future with respect to O. In quadrant II $s^2 > 0$ also, but here
for all events $\tau < 0$; hence, quadrant II is the region of
absolute past with respect to event 1.

Fig. 4.4. The intersection of the space-time cone by the plane
$(x, \tau)$. The point O represents event 1. All events located in
quadrants III and IV represent absolutely remote events with respect
to event O. The events located in quadrant I represent absolute
future while the events located in quadrant II absolute past.

In quadrants III and IV $s^2 < 0$, i.e. the interval between any
event located in this region and event 1 is space-like. All these
events occur at points which do not coincide with the point at
which event 1 occurred, and again it is impossible to alter this
by the choice of a reference frame. However, one can find such
reference frames where a given event from quadrant III or IV
can happen before or after, or, finally, simultaneously with event
1, since the concepts “simultaneously”, “before” and “later” are
relative for the events located in this region.
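
The subdivision of the plane can be restated as a simple rule. The
sketch below is illustrative and uses hypothetical names: events
are pairs (τ, x) with τ = ct, the sign convention is that of
Eq. (4.5), and the second function gives the region of an event
relative to event 1 placed at the origin O.

```python
def interval_squared(event1, event2):
    """s^2 = (tau2 - tau1)^2 - (x2 - x1)^2 for events given as (tau, x)."""
    (tau1, x1), (tau2, x2) = event1, event2
    return (tau2 - tau1) ** 2 - (x2 - x1) ** 2

def classify(event1, event2):
    """Character of the interval between two events."""
    s2 = interval_squared(event1, event2)
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "light-like"

def region(event):
    """Region of the plane (x, tau) in which an event lies relative to O."""
    tau, _ = event
    kind = classify((0, 0), event)
    if kind == "time-like":
        return "absolute future" if tau > 0 else "absolute past"
    if kind == "space-like":
        return "absolutely remote"
    return "light cone"
```

For instance, the event (τ, x) = (2, 1) lies in the absolute
future of O, the event (-2, 1) in its absolute past, and the event
(1, 2) is absolutely remote.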

If one examines two events located arbitrarily in the plane
$(\tau, x)$, the character of the interval between them will be
determined from the slope of the straight line connecting these two
points. If the straight line is inclined to the x axis at the angle
exceeding $\pi/4$, the interval between events 1 and 2 is time-like;
if the angle is less than $\pi/4$, the interval is space-like.
Finally, if this line is parallel to the bisecting line, the
interval is light-like.

In the four-dimensional space the equation describing the
propagation of light has the form $c^2t^2 - x^2 - y^2 - z^2 = 0$. In
terms of geometry this equation represents a “cone” in the
four-dimensional space. Usually this cone is called a light cone.
The internal cavities of this cone correspond to the regions of
“absolute future” and “absolute past”. The light cone surface on
which the light-like directions are located is remarkable owing to
the fact that its position in the four-dimensional space remains
invariable for every world point under all transitions from one IFR
to another.

Fig. 4.5. The pseudo-Pythagorean theorem in the pseudo-Euclidean
space: $AB^2 = BC^2 - AC^2$.

Let an event consist in the arrival of a light ray at a certain 
world point where an observer is located. Thus, we deal with the 
observation of light signals at a given point of space and at a 
given moment of time. The light rays can get at a given world 
point only along those directions of the four-dimensional space 
which lie on the “light cone of past” down to infinity (practically
far enough in terms of light units). Each generatrix of this cone 
can be associated with the point on the spatial sphere of an in- 
finitely great radius in whose centre the observer is located. Such 
an assumed sphere is used for the observation of celestial bodies 
and is called the sky sphere. 

When depicting the pseudo-Euclidean plane on a sheet of paper,
it should be remembered that we are used to such relations between
the lengths of rulers, which are customary in the Euclidean
plane. In Fig. 4.5 a right triangle is shown with the side AC equal
to $x_2 - x_1$ and BC to $\tau_2 - \tau_1$. But in this plane
$AB^2 = BC^2 - AC^2$, according to the definition of the square of
the interval and contrary to the Pythagorean theorem; so this is the
pseudo-Pythagorean theorem. Therefore, the comparison of lengths in
the plane $(x, \tau)$ should be performed cautiously.

In the Euclidean plane (x, y) the locus of points equidistant
from the origin of coordinates is defined by the equation of the
circumference $r^2 = x^2 + y^2 = \text{const}$. In the
pseudo-Euclidean plane $(x, \tau)$ where the square of the distance
from the origin of coordinates is defined by the relationship
$s^2 = \tau^2 - x^2$, the locus of points “equidistant” from the
origin of coordinates (Fig. 4.6) will pattern four hyperbolas ($s^2$
is not necessarily positive). If one chooses the hyperbola for which
$s^2 = 1$ and draws rays from the origin of coordinates till they
intersect with this hyperbola, the section of each of such rays will
determine the unitary “pseudo-Euclidean” length in the
corresponding direction.

Fig. 4.6. The four equilateral hyperbolas $\tau^2 - x^2 = 1$,
$x^2 - \tau^2 = 1$ are plotted in the coordinate system $(x, \tau)$.
Since the Lorentz transformation leaves the expression
$\tau^2 - x^2 = c^2t^2 - x^2$ invariant, we shall also obtain the
hyperbolas $\tau'^2 - x'^2 = 1$, $\tau'^2 - x'^2 = -1$ in the new
oblique-angled coordinate system. But this means that these four
equilateral hyperbolas cross the axes $x$, $\tau$, $x'$, $\tau'$ at
the distances from the origin equal to unity. The hyperbolas plotted
are referred to as scale hyperbolas.

It is possible to give the physical interpretation for plotting the
hyperbola $s^2 = 1$. Let particles having various velocities but the
identical lifetime $\tau_0 = 1$ be generated at the world point
$\tau = 0$, $x = 0$. Then the locus of the world points at which
these particles decay will be the hyperbola $s^2 = 1$, and the world
lines of these particles will be represented by the rays outgoing
from the world point (0, 0) and reaching this hyperbola.
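
This interpretation is easy to check numerically. The sketch below
is only an illustration with hypothetical names: it takes the
lifetime in the particle's own frame to be 1 in units of length
(τ = ct), uses the time dilation factor 1/sqrt(1 - β²), and
computes the world point of the decay in the frame K.

```python
import math

def decay_point(beta, tau0=1.0):
    """World point (tau, x) at which a particle created at (0, 0) decays.

    beta is the particle's velocity in units of c; tau0 is its lifetime
    in its own rest frame, expressed in units of length.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    tau = gamma * tau0   # lifetime measured in the frame K
    x = beta * tau       # path covered during that time
    return tau, x
```

For every velocity the returned point satisfies τ² - x² = 1, so
that all decay points lie on one and the same hyperbola.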


Let us consider the two pairs of equilateral hyperbolas in the
plane $(x, \tau)$:

$$\tau^2 - x^2 = 1, \qquad (4.18)$$

$$x^2 - \tau^2 = 1. \qquad (4.19)$$

One can readily subdivide the plane $(x, \tau)$ into four quadrants,
each containing one hyperbola. The dividing lines between the
quadrants prove to be the asymptotes of these hyperbolas. Indeed,
substituting the equation of the ray $\tau = kx$, passing through
the origin of coordinates with the arbitrary slope $k$
($k = \tan\alpha$), into the equations of hyperbolas (4.18) and
(4.19), we discover that the intersection coordinate is determined
from the equation $x^2 = \pm 1/(1 - k^2)$, where the upper sign
refers to the hyperbolas (4.19) and the lower sign to the hyperbolas
(4.18). For the hyperbolas (4.19) this equation has a real root only
if $k^2 < 1$, and for the hyperbolas (4.18) only if $k^2 > 1$. When
$k^2$ tends to 1, the coordinate of the intersection point on the x
axis moves away into infinity. This means that the rays
$\tau = \pm x$ are asymptotes of these hyperbolas. Thus, the world
lines of the light rays $x = \pm ct$ are the asymptotes of the
hyperbolas defined by Eqs. (4.18) and (4.19).

Each of these hyperbolas intersects only one of the axes: $x$ or
$\tau$. The intersection points of the hyperbolas (4.19) with the $x$
axis are determined from the condition $\tau = 0$. We see that the
hyperbolas (4.19) intersect the $x$ axis at the points $x = \pm 1$. In
a similar way one can find that the hyperbolas (4.18) intersect the
$\tau$ axis at the points $\tau = \pm 1$. Inasmuch as the hyperbolas
(4.18) and (4.19) cut off the unitary sections on the coordinate axes,
it is natural to call them the scale hyperbolas.

Since the expression $\tau^2 - x^2 = c^2t^2 - x^2$ is the invariant of
the Lorentz transformation, the equations $\tau'^2 - x'^2 = 1$,
$\tau'^2 - x'^2 = -1$ will be valid in the frame K'. It follows
directly that the same hyperbolas cut off the unitary sections on the
new oblique-angled axes $x'$ and $\tau'$ as well.
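The invariance just used can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `boost` and the value $B = 0.6$ are assumptions chosen for illustration, and the time coordinate is taken as $\tau = ct$.

```python
import math

def boost(x, tau, B):
    """Lorentz transformation of a world point (x, tau), tau = c*t, to a
    frame K' moving with the velocity V = B*c along the x axis."""
    G = 1.0 / math.sqrt(1.0 - B * B)
    return G * (x - B * tau), G * (tau - B * x)

B = 0.6  # assumed relative velocity of K' in units of c

# Points of the branch tau^2 - x^2 = 1 (tau > 0), parametrised by phi.
for phi in (-1.0, 0.0, 0.5, 2.0):
    x, tau = math.sinh(phi), math.cosh(phi)
    xp, taup = boost(x, tau, B)
    print(f"tau^2 - x^2 = {tau**2 - x**2:.6f}   "
          f"tau'^2 - x'^2 = {taup**2 - xp**2:.6f}")

# The point where the hyperbola x^2 - tau^2 = 1 crosses the x' axis has
# the K-coordinates (Gamma, Gamma*B); in K' it is the point x' = 1, tau' = 0.
G = 1.0 / math.sqrt(1.0 - B * B)
unit_on_x_prime = boost(G, G * B, B)
print(unit_on_x_prime)
```

In this sketch the same hyperbola marks off the unit section on the $x'$ axis although the corresponding point has the Euclidean coordinates $(\Gamma, \Gamma B)$ in the drawing plane.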

It is directly seen from Fig. 4.6 that the unitary sections of the
$x$ and $x'$ axes are far from being equal. It should be remembered
though that the representation of the pseudo-Euclidean plane in the
Euclidean one is conditional and the “proper” units of length are
identically chosen.

Now it becomes easy to explain in geometrical terms how the
contraction of a moving ruler comes about. Let us show the $x$, $\tau$
axes and $x'$, $\tau'$ axes in one figure, and plot that part of the
hyperbola that passes through quadrant I of the coordinate systems K
and K' (Fig. 4.7). The section OA represents a unitary ruler which is
at rest in K. Its world lines in the frame K are straight lines
parallel to the $O\tau$ axis and passing through the points O and A.
But in terms of the frame K' the simultaneous position of the ends of
the section OA at the moment $\tau' = 0$ corresponds to the
intersection of its world lines with the $x'$ axis, i.e. to the points
O and A''. The unitary ruler in K' is equal to OA'; it is seen from
Fig. 4.7 that OA'' < OA' = 1.

Fig. 4.7. The geometric illustration of the relativity of ruler
lengths. Quadrant I of Fig. 4.6 is depicted here. OA is a ruler at
rest in the frame K. The world lines of its ends are $O\tau$ and AA''.
The hyperbola $x^2 - \tau^2 = 1$ intersects the $x$ axis at the point A
and the $x'$ axis at the point A'. Thus, OA = 1 and OA' = 1. To find
simultaneously the position of the ruler's ends in the frame K', the
world lines of the ruler's ends should intersect with some straight
line $\tau' = \text{const}$, for example, with the $x'$ axis
(corresponding to the moment $\tau' = 0$). Then the ruler's length in
the frame K' turns out to be equal to OA''. But OA'' < OA' = 1.

Suppose now that a unitary ruler is at rest in the frame K'
(Fig. 4.8). Then its length is equal to OA' and its world lines are
parallel to the $O\tau'$ axis, one of them being the $O\tau'$ axis
itself, and another the straight line A'B. In order to determine
simultaneously the coordinates of the ruler's ends in terms of the
frame K, the world lines of the ruler's ends are to be intersected by
any straight line $\tau = \text{const}$. It is more convenient for us
to draw the straight line $\tau = 0$. From Fig. 4.8 it is seen that
OB < OA = 1.

Fig. 4.8. The case is illustrated when a ruler is at rest in the
frame K'. The world lines of its ends are straight lines parallel to
$O\tau'$ (the $O\tau'$ axis itself and the straight line passing
through B). The ruler's length in K is determined by the intersection
of these world lines with the $x$ axis ($\tau = 0$) and proves to be
equal to OB. But OB < OA = 1, and we obtain the same result: the
ruler's length is the greatest in the frame where the ruler is at rest.

Fig. 4.9. The geometric illustration of the relativity of time
intervals between two events. Let a clock be at rest in the frame K'
and be located at the origin of coordinates O'. Its world line
coincides with the $O\tau'$ axis. The reading of this clock at the
world point B' differs by a unit from its reading at the point O'. But
in the frame K the point B' is simultaneous with the world point B
(lying on the same straight line $\tau = \text{const}$ with the point
B') at which the clock (located at this point and at rest in the frame
K) will indicate the time determined by the section OB relative to the
reading of another clock from K located at the point O. It is seen
from the figure that OB > O'B' = 1. This implies that the time
interval, during which the clock from the frame K' moves, is less in
terms of K' than in terms of K.
Let us dwell on a geometric illustration of the relativity of time
intervals (Fig. 4.9). Let a clock be at rest at the origin of the
coordinate system K'. Its world line will be the $O\tau'$ axis. At the
moment of time $t = 0$ this moving clock was at the origin of the
coordinate system K, where we had its reading compared against one of
the clocks of the system K located at this point. As before, we
suppose that the clocks from both systems show the time $t = 0$ and
$t' = 0$ at the moment when O and O' coincide. Then the sections OB
and O'B' correspond to the time readings of the clocks of the systems
K and K'.


Fig. 4.10. The same as in the preceding figure, only now the clock is
at rest at the origin of the frame K. The world line of the clock is
the $O\tau$ axis. At the point B the clock will indicate a time unit.
The events lying on the straight line parallel to the $x'$ axis and
passing through the point B will be simultaneous with this moment in
the frame K'. It is clear that OB'' > OB' = 1, i.e. a motionless clock
will register the lesser time interval as compared to a moving clock.


At the world point B' the reading of the moving clock will increase
by unity compared to that at the point O'. But the point B' in the
frame K is simultaneous with all events located at the straight line
$\tau = \text{const}$ passing through the point B'. In particular, the
world line of the clock located at the point $x_1$ and at rest in K
passes exactly through the point B'. This means that if the moving
clock of K' registers the proper-time interval O'B', the time interval
registered by the two clocks of K (located at the points O and $x_1$)
is equal to OB. It is seen in the figure that the time interval
registered by the clock of K' is less, because O'B' = 1 and OB > 1.

Fig. 4.11. The difference between the proper time and the coordinate
time registered by many clocks of the reference frame relative to
which the object moves.

And if the clock is at rest in the system K, it will register a time
unit at the world point B (Fig. 4.10) which is simultaneous with the
point B'' in the system K' (OB'' is the reading of the clock of the
system K' which an observer from the system K will get at the point
B''). The point B'' is obtained as a result of the intersection of the
straight line parallel to the $x'$ axis and passing through the point
B with the $O\tau'$ axis. But OB'' > OB' = 1; consequently, the moving
clock will again register the shorter time interval than the two
motionless clocks. The length of the
world line arc (in the pseudo-Euclidean plane!) is directly associat- 
ed with the proper time of the object, being just proportional to it: 
ds = cdt. Hence, the length of the world line arc enables us to 
conjecture about the proper time that was registered by the clock 
fixed to the particle. It should be remembered, however, that one 
should be careful in the evaluation of the arc length in the
pseudo-Euclidean plane. The “risk” is clearly visible from the fact
that the “arc length” for two points located at the finite spatial
distance from each other may turn out to be equal to zero. Think for
yourself why in the foregoing reasoning we obtained the correct
results on the basis of geometry. Naturally, the peculiarities of the
pseudo-Euclidean plane interfere with the interpretation of the
results.
As an example let us consider the difference between the proper
time and the coordinate one, i.e. the time registered by the clocks
of the system relative to which an object moves. Let the clock Q'
be at rest at the origin of the system K' and its world line be
$OA_1$ (Fig. 4.11). As usual, the coinciding clocks at O and O'
indicate $t = 0$, $t' = 0$.

The world lines of all clocks Q at rest in K are represented by
straight lines parallel to the $\tau$ axis. At the world points $A_1$,
$A_2$, $A_3$, ... one can check the clock Q' against the clocks $Q_1$,
$Q_2$, $Q_3$, ... synchronized in K and indicating the common, unified
for K, time at any world point $A_1$, $A_2$, $A_3$, ... . Its value at
the world point $A_1$ is equal to the length of the world line
$Q_1A_1$. For the clock Q', however, the length of the world line
connecting O' and $A_1$ is equal to $OA_1$. But
$OA_1^2 = Q_1A_1^2 - OQ_1^2$, from where it is clear that
$OA_1 < Q_1A_1$. This implies that the clock Q' checked against the
clocks $Q_1$, $Q_2$, ... at rest in the frame K is slow compared to
the clocks $Q_1$, $Q_2$, ... synchronized in the frame K.


Fig. 4.12. The world lines of two “twins”. The world line of the
“traveller” is the broken line OAB, that of the “stay-at-home” the
straight line OB. The “traveller” undergoes an acceleration when he
reverses his motion direction at the point A and thereby gets into a
non-inertial reference frame for this time interval. The length of
the world line of an object determines its proper-time interval. The
proper-time interval is obviously less for the “traveller” than for
the “stay-at-home” (see the pseudo-Pythagorean theorem in Fig. 4.5).


Finally, let two persons (“twins”) be at the point O at first. 
Then one of them (“a traveller’) moves uniformly and rectili- 
nearly except for a short time interval needed to reverse the veloc- 
ity direction before returning to the initial point O. The other 
“twin” remains at the point O all the time. It is seen from Fig. 4.12 
that the world line of the “traveller” OAB is longer than that of 
the “stay-at-home”. However, in accordance with the pseudo-Py- 
thagorean theorem this means that the “traveller” spent less of 
his local time than the “stay-at-home” did. We shall come back 
again to this problem in Chapter 8. 
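The comparison of the two world lines can be illustrated by a short computation. This is only a sketch under stated assumptions: the turnaround is idealized as instantaneous, the speed $\beta = 0.8$ and the trip duration are arbitrary illustrative numbers, and the helper name `interval` is ours.

```python
import math

def interval(d_tau, d_x):
    """Pseudo-Euclidean length s of a time-like segment,
    s^2 = d_tau^2 - d_x^2, with tau = c*t."""
    return math.sqrt(d_tau**2 - d_x**2)

beta = 0.8        # assumed speed of the traveller in units of c
tau_total = 10.0  # assumed duration of the trip in the stay-at-home frame

# World points: O = (x=0, tau=0), turning point A, return point B.
A = (beta * tau_total / 2, tau_total / 2)

stay_at_home = interval(tau_total, 0.0)
traveller = interval(A[1], A[0]) + interval(tau_total - A[1], -A[0])
print(stay_at_home, traveller)
```

With these numbers the broken line OAB, which looks longer in the Euclidean drawing, has the smaller pseudo-Euclidean length, i.e. the smaller proper time.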


CHAPTER 5 


RELATIVISTIC MECHANICS 
OF A PARTICLE 


The Einstein principle of relativity is valid provided that the 
basic laws of physics are formulated similarly throughout all 
inertial frames and differ only by the notation of variables asso- 
ciated with the given reference frame. In terms of physics the last 
statement implies that in every IFR measurements are carried out 
by means of instruments which are at rest in that frame. But the 
transformation of the coordinates of an event on transition from 
one IFR to another is the Lorentz transformation. Consequently, 
the equations of mechanics, for example, have to retain their ap- 
pearance (in the above-mentioned sense) in any IFR. This condi- 
tion is automatically fulfilled if the equations of mechanics are 
put down in a four-dimensional vector form. Indeed, in this case 
the transformation law of the left-hand and right-hand sides of 
such an equation is known, and it does not change the appearance 
of the equation. When put down in the vector (or the more general 
tensor) form, the equation is said to be written in the covariant 
form. 

The Newtonian equation relating forces and accelerations is 
covariant relative to the Galilean transformation, although it is 
not covariant relative to the Lorentz transformation. However, 
the Lorentz transformation follows unambiguously from the Ein- 
stein postulates which are for certain confirmed experimentally. 
In order to satisfy the principal Einstein postulate on the equiv- 
alence of inertial frames of reference, one has to ensure the co- 
variance of the equations of mechanics under the relativistic trans- 
formation of coordinates and time, i.e. the Lorentz transformation. 
The required equations of mechanics are fairly easy to write us- 
ing the STR’s four-dimensional geometric concept. We shall pro- 
ceed in just this manner. 

Certainly, the development of science does not cancel previously 
known (“correct”) laws, but only sets limits to their application. 
There is always some conformity between various theories de- 
scribing one and the same group of phenomena in extreme cases. 
The majority of equations of classical mechanics correspond to the
extreme cases of relativistic equations with $\beta \to 0$. In other
words, classical mechanics is the extreme case of relativistic mechanics
corresponding to the velocities which are small in comparison 
with that of light. Nevertheless, relativistic mechanics brings for- 
ward such conclusions that could not even be alluded to in the 
framework of classical mechanics (for example, the existence of 
the rest energy of an object). 

§ 5.1. A 4-velocity and 4-acceleration. To write down the rela- 
tions between physical quantities in space-time, we must con- 
struct the required 4-vectors. While doing this, we should re- 
member that in the extreme case of small velocities the Lorentz 
transformation turns into the Galilean one, the relativity of time 
intervals and lengths does not manifest itself any more, and the 
Newtonian equations correspond to the Galilean principle of rela- 
tivity provided it describes the transition from one IFR to an- 
other. In this extreme case time and space are not related, and we 
can utilize conventional three-dimensional quantities. Therefore, 
while composing four-dimensional quantities, we shall always 
try to make their three (spatial) components resemble the
corresponding three-dimensional quantities. In the extreme case of
small velocities ($\beta \to 0$) the three components of
four-dimensional quantities must turn into the conventional mechanical
quantities.

We shall compose a 4-velocity and a 4-acceleration in the same 
way as we do the corresponding quantities in the three-dimension- 
al space where a particle position is specified by the three-di- 
mensional radius vector r and the 3-velocity is determined as the 
derivative of the radius vector with respect to time, $d\mathbf{r}/dt$.
The 4-velocity cannot, however, be defined as a derivative of the
4-radius vector $\vec{R}$ with respect to time. To get the 4-vector
velocity, we have to divide the 4-vector of the increment $d\vec{R}$
by a scalar (an invariant of the Lorentz transformation). But neither
time nor its differential is a scalar.

One can take the interval or the proper time of a particle (see
§ 3.3) as an invariant time-dependent quantity. We shall introduce
once more the proper-time concept, having associated it with the
interval between events. We make use of the fact that the motion of a
particle in the 3-space is a continuous sequence of events consisting
in a particle occupying a definite point in space at a given moment
of time. Let the coordinates of a particle in the frame K change by
$dx$, $dy$, $dz$ during the time $dt$, and its displacement be equal
to $dl = \sqrt{dx^2 + dy^2 + dz^2}$. Consider the instantaneous
inertial frame K' co-moving with the particle, i.e. the frame moving
at the constant velocity $V$ equal to the instantaneous velocity of
the particle. In the frame K' the coordinates of the particle do not
change during the infinitesimal time interval $dt'$:
$dx' = dy' = dz' = 0$. The interval between events is invariant, so
that

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2 = c^2\,dt'^2.$$


In the frame K' the time interval $dt'$ is the proper-time interval.
In this chapter we shall designate it by $d\tau$ (we shall not use the
designation $\tau = ct$ as we did in previous chapters). From the
foregoing equation we have

$$d\tau = \sqrt{1 - \frac{dx^2 + dy^2 + dz^2}{c^2\,dt^2}}\;dt
= \sqrt{1 - \frac{1}{c^2}\left(\frac{dl}{dt}\right)^2}\;dt
= \sqrt{1 - \beta^2}\;dt.$$


We have obtained the familiar result (§ 3.3) and demonstrated
the invariance of the proper time ($d\tau = ds/c$). Here are the
equations to be needed later:

$$d\tau = dt/\gamma, \quad ds = c\,d\tau, \quad \gamma = (1 - \beta^2)^{-1/2}, \quad \beta(t) = v(t)/c. \qquad (5.1)$$


We see that the proper time of a particle is registered by a clock
of an instantaneous co-moving IFR. But these instantaneous
co-moving IFRs change during a finite time interval in the case
of a particle moving with an acceleration. The finite proper time
of such a particle is defined as the overall time registered by many
IFRs. As a matter of principle, the clock should not be rigidly
linked with the particle, since any acceleration affects the clock
rate. The proper time can be registered by the clock fixed rigidly
to the particle only if the acceleration to which this particle is
subjected does not affect the clock rate. The “proper time” can,
however, be readily obtained from the time registered by the clock
of the frame K (relative to which the particle moves) provided
that the time dependence of the particle velocity, i.e. $v = v(t)$,
is known:

$$\tau = \int \sqrt{1 - \frac{v^2(t)}{c^2}}\;dt.$$


It is seen from the last equation and equations (5.1) that the
coordinate time, that is the time registered by all clocks of K, is
a function of the proper time $\tau$. From the equation
$ds = c\,d\tau$ one can see that in addition to the proper time
$d\tau$ one may equally use the interval $ds$, with all equations
differing by various powers of the invariant factor $c$.
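As an illustration of how the proper time is obtained from a known law $v = v(t)$, the integral can be evaluated numerically. The sketch below is not taken from the text: the function name `proper_time`, the midpoint rule, and the chosen velocity law (motion with a constant proper acceleration $a$, for which the closed form $\tau = (c/a)\,\mathrm{arsinh}(at/c)$ is known) are all assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def proper_time(v, t_end, steps=100_000):
    """Integrate d(tau) = sqrt(1 - v(t)^2/c^2) dt from 0 to t_end
    by the midpoint rule."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        total += math.sqrt(1.0 - (v(t_mid) / C) ** 2) * dt
    return total

# Hypothetical velocity law: constant proper acceleration a, for which
# v(t) = a*t / sqrt(1 + (a*t/c)^2).
a = 9.81

def v(t):
    return a * t / math.sqrt(1.0 + (a * t / C) ** 2)

t_end = 3.0e7  # roughly one year of coordinate time, in seconds
tau_numeric = proper_time(v, t_end)
tau_exact = (C / a) * math.asinh(a * t_end / C)  # closed form for this v(t)
print(tau_numeric, tau_exact)
```

The numerical value agrees with the closed form and is less than the coordinate time `t_end`, as Eq. (5.1) requires.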

Now let us introduce the 4-vector velocity

$$\vec{V} = \frac{d\vec{R}}{d\tau}. \qquad (5.2)$$

Since $d\tau$ is an invariant and $d\vec{R}$ a vector, $\vec{V}$ is
also a vector, no doubt. Let us disclose a three-dimensional
connotation of the first three components of (5.2) in the notation of
Eq. (4.7a):

$$u_\alpha = \frac{dx_\alpha}{d\tau} = \gamma\,\frac{dx_\alpha}{dt} = \gamma v_\alpha \quad (\alpha = 1, 2, 3), \qquad (5.3)$$







where $v_\alpha$ are the components of the conventional 3-velocity.
Therefore, the first three components of the 4-velocity are those of
the conventional 3-velocity multiplied by the factor $\gamma$
depending on the absolute value of the particle velocity. The fourth
component is to be found separately:

$$u_4 = \frac{dx_4}{d\tau} = \gamma\,\frac{d(ict)}{dt} = ic\gamma. \qquad (5.4)$$


In accordance with the notation of Eq. (4.7) we have

$$u^0 = \frac{dx^0}{d\tau} = \gamma\,\frac{d(ct)}{dt} = \gamma c, \qquad
u^\alpha = \frac{dx^\alpha}{d\tau} = \gamma\,\frac{dx^\alpha}{dt} = \gamma v^\alpha.$$





Similarly to Eq. (4.7a, b) one can write

$$\vec{V}(u_1, u_2, u_3, u_4) = \vec{V}(\gamma v_x, \gamma v_y, \gamma v_z, ic\gamma); \quad (a) \qquad
\vec{V}(u^0, u^1, u^2, u^3) = \vec{V}(\gamma c, \gamma v_x, \gamma v_y, \gamma v_z). \quad (b) \qquad (5.5)$$

When $\beta \to 0$, i.e. when the velocity of an object $v \ll c$, the
factor $\gamma \approx 1$, and the first three components of the
4-velocity of (5.5a) as well as the last three components of
Eq. (5.5b) coincide with the conventional velocity. Of special
interest is the fourth component of (5.5a) and the zeroth one of
(5.5b) for the 4-velocity. They are different from zero even when a
particle is at rest (if $v = 0$, $\gamma = 1$ and $u_4 = ic$, but
$u^0 = c$). The last result has the obvious meaning: time cannot be
stopped, it always flows without interruption. Accordingly, there is
no quiescence in the four-dimensional world (in the sense that
$\vec{V} \neq 0$). As to the “velocity of the time flow”, it is
defined by the choice of time units, of course.
The components of the 4-velocity can be also put down as follows:

$$\vec{V}(\gamma\mathbf{v}, ic\gamma); \quad (a) \qquad \vec{V}(c\gamma, \gamma\mathbf{v}). \quad (b) \qquad (5.6)$$


The square of the 4-vector is an invariant. It can be found from
Eqs. (4.11a) and (4.11b) respectively:

$$\vec{V}^2 = \gamma^2 v^2 - c^2\gamma^2 = -c^2; \qquad \vec{V}^2 = c^2\gamma^2 - \gamma^2 v^2 = c^2.$$


The computation is easiest when made in the inherent reference
frame of a particle at rest ($v = 0$). Then in (5.5a) only
$u_4 = ic$ will differ from zero, and in (5.5b) only $u^0 = c$.
Consequently,

$$\vec{V}^2 = u_1^2 + u_2^2 + u_3^2 + u_4^2 = -c^2; \quad (a) \qquad
\vec{V}^2 = (u^0)^2 - (u^1)^2 - (u^2)^2 - (u^3)^2 = c^2. \quad (b) \qquad (5.7)$$


The squares of the 4-velocity in Eqs. (5.7a) and (5.7b) are opposite
in sign due to the different determination of the interval (see
Chapter 4). When the appropriate determination of $\vec{V}^2$ is
chosen, however, this sign does not change, and it follows that
$v < c$ in all cases.
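The invariance of the square of the 4-velocity is easy to verify numerically. The following sketch uses the real notation of Eq. (5.5b); the helper names and the sample velocities are illustrative assumptions, and units with $c = 1$ are adopted for brevity.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for brevity)

def four_velocity(v):
    """Components (u0, u1, u2, u3) = (gamma*c, gamma*v), as in Eq. (5.5b)."""
    v2 = sum(comp * comp for comp in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C**2)
    return (gamma * C,) + tuple(gamma * comp for comp in v)

def minkowski_square(a):
    """Square of a 4-vector with the signature (+, -, -, -)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2

for v in [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.5, -0.4, 0.6)]:
    print(v, minkowski_square(four_velocity(v)))
```

Whatever admissible 3-velocity is substituted, the square comes out equal to $c^2$ within rounding, in agreement with Eq. (5.7b).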

As soon as the velocity in 4-space is written down in the form 
of a 4-vector, the transformation equations for the velocity com- 
ponents on transition from one inertial frame to another can be 


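obtained at once. Let the components of the 4-velocity
$\vec{V}(u_1, u_2, u_3, u_4)$ be specified in the frame K. In
accordance with Eq. (4.10a) we shall obtain in the frame K'

$$u_1' = \Gamma(u_1 + iBu_4), \quad u_2' = u_2, \quad u_3' = u_3, \quad u_4' = \Gamma(u_4 - iBu_1), \qquad (5.8)$$

but the 4-velocities have the components
$\vec{V}(\gamma\mathbf{v}, ic\gamma)$,
$\vec{V}'(\gamma'\mathbf{v}', ic\gamma')$. Having substituted them in
Eq. (5.8), we get

$$\gamma' v_x' = \Gamma(\gamma v_x - \gamma V), \quad \gamma' v_y' = \gamma v_y, \quad \gamma' v_z' = \gamma v_z, \quad
ic\gamma' = \Gamma(ic\gamma - iB\gamma v_x). \qquad (5.9)$$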


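It follows from the last equation of (5.9) that

$$\frac{\gamma}{\gamma'} = \frac{1}{\Gamma\left(1 - \dfrac{V v_x}{c^2}\right)}. \qquad (5.10)$$

Substituting this expression in the first three equations (5.9),

$$v_x' = \frac{\gamma}{\gamma'}\,\Gamma(v_x - V), \quad v_y' = \frac{\gamma}{\gamma'}\,v_y, \quad v_z' = \frac{\gamma}{\gamma'}\,v_z,$$

we shall obtain the equations for the velocity components in K'
which were derived in Chapter 3 from the Lorentz transformation.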

Note, incidentally, that if in place of Eq. (5.8) of transition
from K to K' one uses the equations for the reverse transition
from K' to K, the following equation is obtained

$$\frac{\gamma}{\gamma'} = \Gamma\left(1 + \frac{V v_x'}{c^2}\right) \qquad (5.10')$$

instead of Eq. (5.10). This way we obtain the value of
$\gamma/\gamma'$ in terms of the velocity components in the frame K'.
From Eq. (5.10') follows Eq. (3.41) derived otherwise here:

$$\sqrt{1 - \beta^2} = \frac{\sqrt{1 - \beta'^2}}{\Gamma\left(1 + \dfrac{V v_x'}{c^2}\right)}
= \frac{\sqrt{1 - B^2}\,\sqrt{1 - \beta'^2}}{1 + \dfrac{V v_x'}{c^2}}.$$

It follows from Eq. (5.10) that if a particle is at rest in K
($v = 0$), then $\gamma' = \Gamma$; this result is obvious, because a
particle which is at rest in K moves at the velocity $-V$ relative
to K'.

The same result is obtained if one makes use of Eq. (5.6b) and 
the transformation equation (4.10b). We suggest that the reader 
do it himself. Our result is obvious: the spatial components of the 
4-velocity determine the transformation of the conventional 3-ve- 
locity. 
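The statement that the transformation of the 4-velocity reproduces the transformation of the conventional 3-velocity can be checked with a few lines of code. The sketch below works in the real notation of Eq. (5.6b) and assumes the standard form of the boost along $x$ for a real 4-vector (the text's Eq. (4.10b) is not reproduced in this section, so this form is an assumption); the numbers and function names are illustrative only.

```python
import math

C = 1.0  # units with c = 1 (assumption)

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_velocity(v):
    g = gamma(math.sqrt(sum(comp * comp for comp in v)))
    return [g * C, g * v[0], g * v[1], g * v[2]]

def boost_x(u, V):
    """Standard Lorentz boost of a real 4-vector (u0, u1, u2, u3) along x."""
    G, B = gamma(V), V / C
    return [G * (u[0] - B * u[1]), G * (u[1] - B * u[0]), u[2], u[3]]

def three_velocity(u):
    """Recover the 3-velocity: v_a = c * u^a / u^0."""
    return [C * u[k] / u[0] for k in (1, 2, 3)]

v = (0.5, 0.3, 0.0)  # particle velocity in K (hypothetical numbers)
V = 0.6              # velocity of K' relative to K

u_prime = boost_x(four_velocity(v), V)
v_prime = three_velocity(u_prime)

# Velocity transformation obtained directly from the Lorentz transformation
denom = 1.0 - V * v[0] / C**2
expected = [(v[0] - V) / denom,
            v[1] / (gamma(V) * denom),
            v[2] / (gamma(V) * denom)]
print(v_prime, expected)

# Eq. (5.10): gamma / gamma' = 1 / (Gamma * (1 - V*v_x/c^2))
ratio = (four_velocity(v)[0] / C) / (u_prime[0] / C)
print(ratio, 1.0 / (gamma(V) * denom))
```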

Now we are to define the 4-acceleration which we shall also
construct as a 4-vector:

$$\vec{W} = \frac{d^2\vec{R}}{d\tau^2} = \frac{d\vec{V}}{d\tau}, \qquad (5.11)$$

or expressed via components:

$$w_i = \frac{du_i}{d\tau} = \frac{d^2x_i}{d\tau^2}; \quad (a) \qquad
w^i = \frac{du^i}{d\tau} = \frac{d^2x^i}{d\tau^2}. \quad (b) \qquad (5.12)$$

Below we shall write out a few formulae dealing with acceleration,
using the notation of Eq. (4.7a). They will be needed only in special
cases. The four-dimensional acceleration components can be expressed
by means of the three-dimensional components of the vectors
$\mathbf{v}$ and $\dot{\mathbf{v}}$. We get


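$$w_\alpha = \frac{d}{d\tau}(\gamma v_\alpha) = \frac{dt}{d\tau}\,\frac{d}{dt}(\gamma v_\alpha)
= \gamma\dot{\gamma}\,v_\alpha + \gamma^2\dot{v}_\alpha
= \gamma^2\dot{v}_\alpha + \frac{\gamma^4}{c^2}\,(\mathbf{v}\dot{\mathbf{v}})\,v_\alpha, \qquad (5.13)$$

because, as it is easy to verify,

$$\dot{\gamma} = \frac{d\gamma}{dt} = \gamma^3\beta\dot{\beta} = \frac{\gamma^3}{c^2}\,(\mathbf{v}\dot{\mathbf{v}}), \qquad (5.14)$$

and $dt/d\tau = \gamma$. The fourth component of the acceleration is

$$w_4 = \frac{d}{d\tau}(ic\gamma) = ic\gamma\dot{\gamma} = ic\gamma^4\beta\dot{\beta} = \frac{i}{c}\,\gamma^4\,(\mathbf{v}\dot{\mathbf{v}}). \qquad (5.15)$$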


In the case of the uniform motion ($\dot{\mathbf{v}} = 0$) all four
acceleration components turn into zero. In the reference frame in
which a particle is at rest

$$w_1 = \dot{v}_x, \quad w_2 = \dot{v}_y, \quad w_3 = \dot{v}_z, \quad w_4 = 0, \qquad (5.16)$$

i.e. the three spatial components of the 4-acceleration coincide
with the conventional three-dimensional components of the
acceleration, while the time component turns into zero. It is seen
from Eq. (5.16) that

$$\vec{W}^2 = w_i^2 = \dot{\mathbf{v}}^2 > 0.$$

Due to the invariance of the square of the 4-vector norm (see
Appendix I, § 1) one may regard the 4-vector acceleration as a
space-like vector (see the definition of the interval, Eq. (4.5)).


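Let us write out the components of the 4-vector acceleration
$\vec{W}$ in the notation of Eq. (4.7):

$$\vec{W}\left(\gamma\frac{d}{dt}(\gamma\mathbf{v}),\ \frac{ic}{2}\frac{d\gamma^2}{dt}\right)
= \left(\gamma^2\dot{\mathbf{v}} + \gamma\dot{\gamma}\,\mathbf{v},\ \frac{i\gamma}{mc}\frac{d\mathscr{E}}{dt}\right), \quad (a)$$
$$\vec{W}\left(\gamma\frac{d}{dt}(c\gamma),\ \gamma\frac{d}{dt}(\gamma\mathbf{v})\right)
= \left(\frac{\gamma}{mc}\frac{d\mathscr{E}}{dt},\ \gamma\frac{d}{dt}(\gamma\mathbf{v})\right). \quad (b) \qquad (5.17)$$

The particle energy $\mathscr{E}$ introduced here will be defined
later on (see Eq. (5.32) below). Using Eqs. (5.15) and (5.17), one can
easily obtain

$$\vec{W}^2 = w_i^2 = \gamma^6\left(\dot{\mathbf{v}}^2 - \frac{1}{c^2}\,[\mathbf{v} \times \dot{\mathbf{v}}]^2\right) > 0. \qquad (5.18)$$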
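Equation (5.18) can be verified numerically from the components (5.13) and (5.15). In the sketch below the time component is taken without the factor $i$, so that the square of the 4-vector is $|\mathbf{w}|^2 - w_0^2$; the function names and the sample values of $\mathbf{v}$ and $\dot{\mathbf{v}}$ are assumptions made for illustration, in units with $c = 1$.

```python
import math

C = 1.0  # units with c = 1 (assumption)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def four_acceleration(v, a):
    """Spatial part from Eq. (5.13) and the time part of Eq. (5.15) taken
    without the factor i: w0 = gamma^4 (v.a)/c, so W^2 = |w|^2 - w0^2."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / C**2)
    va = dot(v, a)
    spatial = tuple(g**2 * a[k] + g**4 * va * v[k] / C**2 for k in range(3))
    w0 = g**4 * va / C
    return spatial, w0, g

v = (0.3, -0.2, 0.5)  # hypothetical 3-velocity
a = (1.0, 2.0, -0.5)  # hypothetical 3-acceleration dv/dt

spatial, w0, g = four_acceleration(v, a)
lhs = dot(spatial, spatial) - w0**2
vxa = cross(v, a)
rhs = g**6 * (dot(a, a) - dot(vxa, vxa) / C**2)
print(lhs, rhs)
```

Both sides coincide and are positive, which illustrates the space-like character of the 4-acceleration.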


Now let us write out the transformation equation for the
3-acceleration ($\dot{\mathbf{v}} = d\mathbf{v}/dt$,
$\dot{\mathbf{v}}' = d\mathbf{v}'/dt'$) on transition from one IFR to
another. The Galilean transformation leaves the 3-acceleration of
a particle invariable. The Lorentz transformation changes the
3-acceleration components. The simplest way to derive the
transformation equation for the 3-acceleration components is as
follows. Regarding $v_x$ and $v_x'$ as functions of $t$ and $t'$
respectively, and taking into account the relationship between $t$
and $t'$ (Eq. (2.16)), and, finally, designating $v_x/c = \beta_x$,
$v_x'/c = \beta_x'$, etc., we shall obtain from Eq. (3.26)

$$dv_x = \frac{dv_x'}{\Gamma^2(1 + B\beta_x')^2}, \qquad
dv_y = \frac{dv_y' - B(\beta_y'\,dv_x' - \beta_x'\,dv_y')}{\Gamma(1 + B\beta_x')^2},$$

and in much the same way, $dv_z$. Having divided the left-hand and
right-hand sides of these equations by the left-hand and right-hand
sides of the equation $dt = \Gamma\left(dt' + \dfrac{V}{c^2}\,dx'\right)$,
respectively, we get

$$\dot{v}_x = \frac{\dot{v}_x'}{\Gamma^3(1 + B\beta_x')^3}, \qquad
\dot{v}_y = \frac{\dot{v}_y' + B(\beta_x'\dot{v}_y' - \beta_y'\dot{v}_x')}{\Gamma^2(1 + B\beta_x')^3}, \qquad
\dot{v}_z = \frac{\dot{v}_z' + B(\beta_x'\dot{v}_z' - \beta_z'\dot{v}_x')}{\Gamma^2(1 + B\beta_x')^3}.$$
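Of course, the same result will be obtained via the transformation
of the 4-vector acceleration $\vec{W}$. In the frames K and K' it is
easiest to put it down in the form
$\vec{W}(\gamma^2\dot{\mathbf{v}} + \gamma\dot{\gamma}\mathbf{v},\ ic\gamma\dot{\gamma})$,
$\vec{W}'(\gamma'^2\dot{\mathbf{v}}' + \gamma'\dot{\gamma}'\mathbf{v}',\ ic\gamma'\dot{\gamma}')$.
From the transformation equations for the 4-vector components

$$w_1 = \Gamma(w_1' - iBw_4'), \qquad w_4 = \Gamma(w_4' + iBw_1')$$

we get

$$\gamma^2\dot{v}_x + \gamma\dot{\gamma}v_x = \Gamma\left(\gamma'^2\dot{v}_x' + \gamma'\dot{\gamma}'v_x' + \gamma'\dot{\gamma}'V\right),$$
$$\gamma\dot{\gamma} = \Gamma\left[\gamma'\dot{\gamma}' + \frac{B}{c}\left(\gamma'^2\dot{v}_x' + \gamma'\dot{\gamma}'v_x'\right)\right].$$

Using Eq. (5.10'), $\dot{v}_x$ will be found from these relations.
Substituting $w_2 = w_2'$ and $w_3 = w_3'$ in the transformation
equations, we obtain the equations for transformation of $\dot{v}_y$
and $\dot{v}_z$.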

The transformation equation for the 3-acceleration components 
involves the velocity of a particle. But the 3-acceleration appears 
only when the velocity varies. Consequently, even when the
3-acceleration is constant in one IFR, it varies with time in all
other frames: in relativistic mechanics a uniformly accelerated motion
in one IFR is not such in all others.
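The last remark can be illustrated for motion along the $x$ axis. The sketch below assumes the standard results $\dot{v}_x = \dot{v}_x'/[\Gamma^3(1 + Vv_x'/c^2)^3]$ and $dt = \Gamma(1 + Vv_x'/c^2)\,dt'$ for the transition from K' to K, and checks the first against a finite difference of the velocity addition law; all names and numbers are hypothetical.

```python
import math

C = 1.0  # units with c = 1 (assumption)
V = 0.5  # velocity of K' relative to K along the x axis (hypothetical)
GAMMA = 1.0 / math.sqrt(1.0 - (V / C) ** 2)

def accel_x_in_K(vx_prime, ax_prime):
    """x-component of the 3-acceleration in K for motion along x in K'."""
    return ax_prime / (GAMMA**3 * (1.0 + V * vx_prime / C**2) ** 3)

def vx_in_K(vx_prime):
    """Relativistic velocity addition (transition from K' to K)."""
    return (vx_prime + V) / (1.0 + V * vx_prime / C**2)

a_prime = 0.1  # constant 3-acceleration in K' (hypothetical)

# Finite-difference check at v'_x = 0.2, using
# dt = Gamma * (1 + V v'_x / c^2) * dt'.
vp, dtp = 0.2, 1e-6
dv = vx_in_K(vp + a_prime * dtp) - vx_in_K(vp - a_prime * dtp)
dt = GAMMA * (1.0 + V * vp / C**2) * 2 * dtp
numeric = dv / dt

samples = [accel_x_in_K(vp_sample, a_prime) for vp_sample in (0.0, 0.2, 0.4)]
print(samples)
print(numeric, accel_x_in_K(vp, a_prime))
```

Although the acceleration is constant in K', its value in K changes as the velocity $v_x'$ grows.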

§ 5.2. A 4-force and a four-dimensional equation of motion. Here 
are some three-dimensional classical relations to be referred to 
later on quite often. In classical mechanics a mass of a particle 
is treated as a constant. We shall designate it by m. The second 
law of Newton is put down as follows in classical mechanics:

$$\frac{d}{dt}(m\mathbf{v}) = \mathbf{F} \qquad (5.19a)$$

or

$$\frac{d\mathbf{p}}{dt} = \mathbf{F}, \qquad (5.19b)$$

where F is a three-dimensional vector of a conventional force; 
the quantity p = mv is called a classical momentum of a par- 
ticle. 

Multiplying the left-hand and right-hand sides of Eq. (5.19a)
by $\mathbf{v}\,dt$, we derive by means of simple transformations the
corollary of the second law of Newton:

$$d(mv^2/2) = \mathbf{F}\mathbf{v}\,dt. \qquad (5.20)$$


The right-hand side of Eq. (5.20) represents the work accom- 
plished by the force F; in accordance with the energy conserva- 
tion law the left-hand side must incorporate the change of energy. 
Hence, the energy of a particle can be defined as $T = mv^2/2$ with
an accuracy of a constant addendum. Here the addendum is
adopted to be equal to zero, so that the particle at rest possesses
no energy. Consequently, the energy $mv^2/2$ is associated only with
the motion of the particle; accordingly, it is appropriately called
the kinetic energy (the “motion energy”). It should be pointed out
that if we assumed that a motionless object possesses the energy
$\mathscr{E}_0$, the “total” energy of a moving object would be
$\mathscr{E} = T + \mathscr{E}_0$. The constant $\mathscr{E}_0$ can be
interpreted as a permanent potential, or internal, energy. But in
classical mechanics there is no reason to do this, and
$\mathscr{E}_0$ can be treated as an arbitrary constant. Consequently,
in classical mechanics the “total” energy of a free object may have
either sign in principle (depending on the sign of the constant
$\mathscr{E}_0$). Customarily $\mathscr{E}_0 = 0$, and the total
energy of a free object coincides with the kinetic one.

Suppose now that a particle is in a potential field, i.e. the force
acting on a particle can be expressed as
$\mathbf{F} = -\operatorname{grad} U$, where $U(x, y, z)$ is a
potential energy. Since $\mathbf{v}\,dt = d\mathbf{r}$ and
$\operatorname{grad} U\,d\mathbf{r} = dU$, Eq. (5.20) will take the
form

$$d(mv^2/2) = -dU,$$

from where follows the important law of classical mechanics, that
is the law of conservation of the total energy:

$$\frac{d}{dt}(T + U) = 0,$$

in other words, $T + U = \text{const}$.
Now we can pass over to the definition of a 4-momentum of a
particle $\vec{P}$. As in the case of the 3-momentum
($\mathbf{p} = m\mathbf{v}$) we specify the 4-momentum as the product
of the invariant (scalar) mass $m$ by the 4-velocity $\vec{V}$, so
that $\vec{P} = m\vec{V}$. Therefore

$$\vec{P}(m\gamma\mathbf{v}, im\gamma c); \quad (a) \qquad \vec{P}(m\gamma c, m\gamma\mathbf{v}). \quad (b) \qquad (5.21)$$


As it will be clear later, the invariant mass $m$ is expediently
called a rest mass. Analogously with Eq. (5.19) one may suppose
that the four-dimensional equation of motion has the form

$$\frac{d\vec{P}}{d\tau} = \vec{F} \qquad (5.22)$$

or in components

$$m\,\frac{du_i}{d\tau} = \mathfrak{F}_i, \qquad (5.23)$$

where the differentiation is naturally accomplished with respect
to the invariant proper time $d\tau$ (otherwise a vector relation will
not be obtained), while the right part of the equation contains the
4-vector force
$\vec{F}(\mathfrak{F}_1, \mathfrak{F}_2, \mathfrak{F}_3, \mathfrak{F}_4)$,
whose components are still to be determined.

We shall recall once more (see § 4.3) why it is important to
have the motion equation in a four-dimensional vector form (Eq. 
(5.22)). The point is that in accordance with the first postulate 
of Einstein all basic laws of physics must have the same form in 
all IFRs. In mathematical terms this means that equations describ- 
ing physical laws have to be represented in the covariant form 
with respect to the Lorentz transformation. Equations are said 
to be written down in the covariant form if their left-hand and 
right-hand sides change alike under the Lorentz transformation. 
But this implies that the left-hand and right-hand sides must be 
correspondingly either scalar (invariant) quantities or 4-vectors, 
or tensors of the same rank (to be dealt with in Chapter 6). This 
is enough to ensure the invariance of relations presented in this 
form on transition from one IFR to another. Having put down 
the motion equations in the vector form (Eq. (5.22)), we ensured 
the covariance of this equation under the Lorentz transformation, 
i.e. the universal character of the Einstein principle of relativity. 


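The components of the vector $d\vec{P}/d\tau$ are familiar to us, for
we know the components of $d\vec{V}/d\tau$ from Eq. (5.17a, b) and $m$
is the invariant:

$$\frac{d\vec{P}}{d\tau}\left(\gamma\frac{d}{dt}(m\gamma\mathbf{v}),\ i\gamma\frac{d}{dt}(m\gamma c)\right), \quad (a)$$
$$\frac{d\vec{P}}{d\tau}\left(\gamma\frac{d}{dt}(m\gamma c),\ \gamma\frac{d}{dt}(m\gamma\mathbf{v})\right). \quad (b) \qquad (5.24)$$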


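We have denoted the components of the 4-vector $\vec{F}$ by the
gothic letter $\mathfrak{F}$ supplied by indices, i.e.
$\vec{F}(\mathfrak{F}_1, \mathfrak{F}_2, \mathfrak{F}_3, \mathfrak{F}_4)$
or
$\vec{F}(\mathfrak{F}^0, \mathfrak{F}^1, \mathfrak{F}^2, \mathfrak{F}^3)$.
Equating 4-vectors, we equate their components. The first three
components of Eq. (5.24a) and the last three components of
Eq. (5.24b) are obtained as follows ($\alpha = 1, 2, 3$):

$$\gamma\frac{d}{dt}(m\gamma v_\alpha) = \mathfrak{F}_\alpha; \quad (a) \qquad
\gamma\frac{d}{dt}(m\gamma v^\alpha) = \mathfrak{F}^\alpha. \quad (b) \qquad (5.25)$$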


Now let us determine the first three components of the 4-force
$\mathfrak{F}_\alpha$. Obviously, they are proportional to the 3-force
components since in the limit transition $\beta \to 0$ we have to get
back to the conventional equation of Newton. If one retains the
conventional definition of force and supposes, as before, that a
“force determines a change of momentum”, one should write

$$\mathfrak{F}_\alpha = \gamma F_\alpha, \qquad \mathfrak{F}^\alpha = \gamma F^\alpha,$$


where $F_\alpha$ are the components of a conventional
three-dimensional force. Having substituted the expressions for
$\mathfrak{F}_\alpha$ into the right-hand side of Eq. (5.25a), we
obtain

$$\frac{d}{dt}(m\gamma v_\alpha) = F_\alpha \quad (\alpha = 1, 2, 3),$$

and having multiplied each of these equations by a corresponding
unit coordinate vector $\mathbf{n}_\alpha$ and summed the relations
thus obtained, we get the motion equation in a vector form:

$$\frac{d}{dt}(m\gamma\mathbf{v}) = \mathbf{F}. \qquad (5.26)$$


Comparing Eq. (5.26) with the non-relativistic motion equation
(5.19), we notice that they differ only in the definition of a
momentum. The relativistic (three-dimensional) momentum is
represented by the quantity

$$\mathbf{p} = m\gamma\mathbf{v}; \qquad (5.27)$$

in this case Eq. (5.26) resembles Eq. (5.19) in outer appearance.

Thus, the three spatial components of Eq. (5.23) have yielded
the second law of Newton in a relativistic form. However, the
meaning of the fourth (or zeroth) relation is yet to be cleared up.
In order to do this, $\mathfrak{F}_4$ (or $\mathfrak{F}^0$) should be known.
But it turns out that once the three spatial components of the 4-force
are determined, the fourth component is thereby determined as well.
One can make sure of this in the following way. Differentiating
Eq. (5.7a, b) with respect to $\tau$, we get

$$u_1\frac{du_1}{d\tau} + u_2\frac{du_2}{d\tau} + u_3\frac{du_3}{d\tau} + u_4\frac{du_4}{d\tau} = 0, \qquad \text{(a)}$$

$$u^0\frac{du^0}{d\tau} - u^1\frac{du^1}{d\tau} - u^2\frac{du^2}{d\tau} - u^3\frac{du^3}{d\tau} = 0. \qquad \text{(b)} \qquad (5.28)$$


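But according to Eq. (5.23) $du_i/d\tau = \mathfrak{F}_i/m$, with the first
three components of $\mathfrak{F}_i$ being determined by Eq. (5.26) and the
components $u_i$ by Eq. (5.5). Hence, Eq. (5.28a), for example, can be
rewritten in the following form:

$$\gamma v_x\frac{\gamma F_x}{m} + \gamma v_y\frac{\gamma F_y}{m} + \gamma v_z\frac{\gamma F_z}{m} + ic\gamma\frac{\mathfrak{F}_4}{m} = 0,$$

whence $\mathfrak{F}_4$ (and similarly $\mathfrak{F}^0$) can be derived at once:

$$\mathfrak{F}_4 = i\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v}); \quad \text{(a)} \qquad \mathfrak{F}^0 = \frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v}). \quad \text{(b)} \qquad (5.29)$$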


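We give the reader a chance to derive Eq. (5.29b) by himself.

Thus, we have found the components of the 4-force $\vec{F}$, referred
to as the Minkowski force:

$$\vec{F}(\mathfrak{F}_1, \mathfrak{F}_2, \mathfrak{F}_3, \mathfrak{F}_4) = \left(\gamma F_x,\; \gamma F_y,\; \gamma F_z,\; i\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v})\right) = \left(\gamma\mathbf{F},\; i\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v})\right), \qquad \text{(a)}$$

$$\vec{F}(\mathfrak{F}^0, \mathfrak{F}^1, \mathfrak{F}^2, \mathfrak{F}^3) = \left(\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v}),\; \gamma F_x,\; \gamma F_y,\; \gamma F_z\right) = \left(\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v}),\; \gamma\mathbf{F}\right). \qquad \text{(b)} \qquad (5.30)$$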


To clear up the meaning of the fourth relation of Eq. (5.23) in
terms of Eq. (5.5a), or the zeroth one in terms of Eq. (5.5b), the
corresponding components in Eqs. (5.24) and (5.30) should be
equated:

$$i\gamma\frac{d}{dt}(m\gamma c) = i\frac{\gamma}{c}(\mathbf{F}\cdot\mathbf{v}),$$

or otherwise

$$\frac{d}{dt}(m\gamma c^2) = \mathbf{F}\cdot\mathbf{v}. \qquad (5.31)$$


Here we can repeat the reasoning developed in connection with
Eq. (5.20). The right-hand side of Eq. (5.31) represents the work
performed by the force per unit time; the left-hand side must
contain the change of energy. Let us define the total energy of a
free relativistic particle as

$$\mathscr{E} = mc^2\gamma = mc^2(1 - \beta^2)^{-1/2}; \qquad (5.32)$$

here $\beta = v/c$, where v is the absolute value of the three-dimensional
velocity of the particle. It should be pointed out that Eq. (5.31)
determines the energy of the particle only up to an additive
constant. Eq. (5.32) means that a motionless particle (v = 0,
$\beta$ = 0) possesses the energy $\mathscr{E}_0 = mc^2$. Such a value for the
constant is not chosen at will, but comes from the limit transition
to the classical velocity summation formula.

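We shall postpone the discussion of the relativistic equation of
motion (5.26) and the relativistic energy relation (Eq. (5.32)) till
§§ 5.3 and 5.4. In the meantime we shall dwell on the transformation
of the 4-force and the consequences following from it.
Let us write out the force transformation law (in terms of Eq.
(5.30a)), where $B = V/c$ and $\Gamma = (1 - B^2)^{-1/2}$ refer to the relative
velocity V of the frames:

$$\mathfrak{F}'_1 = \Gamma(\mathfrak{F}_1 + iB\mathfrak{F}_4), \quad \mathfrak{F}'_2 = \mathfrak{F}_2, \quad \mathfrak{F}'_3 = \mathfrak{F}_3, \quad \mathfrak{F}'_4 = \Gamma(\mathfrak{F}_4 - iB\mathfrak{F}_1). \qquad (5.33)$$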


We shall begin with a simple case. Let the three-dimensional
force $\mathbf{F}^0$ act on a particle which is at rest in the frame $K^0$.
Then in accordance with Eq. (5.30a) $\vec{F}^0(\mathbf{F}^0, 0)$. From Eq. (5.33)
we obtain

$$\gamma' F'_x = \Gamma F^0_x, \quad \gamma' F'_y = F^0_y, \quad \gamma' F'_z = F^0_z, \quad \gamma'(\mathbf{F}'\cdot\mathbf{v}') = -\Gamma V F^0_x.$$

In the case considered the particle moves relative to $K'$ at the
velocity of the reference frame $K^0$, i.e. at the velocity $-V$.
Consequently, $\gamma' = \Gamma$, and we obtain the transformation formulae
for the components of the force and for the work performed by this
force per unit time:

$$F'_x = F^0_x, \quad F'_y = F^0_y\sqrt{1 - B^2}, \quad F'_z = F^0_z\sqrt{1 - B^2}, \quad \mathbf{F}'\cdot\mathbf{v}' = -VF^0_x. \qquad (5.34)$$

It is seen from the first three relations (5.34) that the force
component parallel to the relative velocity remains unchanged,
while the force components normal to the relative velocity change.
It is easy to find out the meaning of the last relation
of (5.34). If the particle was at rest in the frame $K^0$, it moves at
the velocity $-V$ in the frame $K'$. The work is performed only by
the force component $F'_x$ (all other components being normal to
the direction of motion). The power developed by the force $F'_x$ in
the frame $K'$ is equal to $-F'_x V$, which corresponds to the result
that we obtained.

From Eqs. (5.34) one sees that in the non-relativistic case, when
$B \ll 1$, the three-dimensional force does not change on transition
from one IFR to another. This fact wholly agrees with our intu- 
itive ideas of the force invariance in any reference frame. However, 
in the presentation of the STR and, in particular, in the derivation 
of certain relations of the STR, when making use of the transfor- 
mation of forces, one has to emphasize first of all the variation 
of force components on transition from one IFR to another. 

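In the general case, using Eq. (5.30a), we obtain from Eq. (5.33)

$$\gamma' F'_x = \Gamma\left[\gamma F_x - \frac{B}{c}\gamma(\mathbf{F}\cdot\mathbf{v})\right], \quad \gamma' F'_y = \gamma F_y, \quad \gamma' F'_z = \gamma F_z,$$

$$\gamma'(\mathbf{F}'\cdot\mathbf{v}') = \Gamma\left[\gamma(\mathbf{F}\cdot\mathbf{v}) - V\gamma F_x\right].$$

Rewriting the last equations in the form

$$F'_x = \frac{\gamma}{\gamma'}\Gamma\left[F_x - \frac{B}{c}(\mathbf{F}\cdot\mathbf{v})\right], \quad F'_y = \frac{\gamma}{\gamma'}F_y, \quad F'_z = \frac{\gamma}{\gamma'}F_z,$$

$$(\mathbf{F}'\cdot\mathbf{v}') = \frac{\gamma}{\gamma'}\Gamma\left[(\mathbf{F}\cdot\mathbf{v}) - VF_x\right] \qquad (5.35)$$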
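and taking into account Eq. (5.10), we obtain finally

$$F'_x = \frac{F_x - \dfrac{B}{c}(\mathbf{F}\cdot\mathbf{v})}{1 - \dfrac{V}{c^2}v_x}, \qquad F'_y = \frac{F_y\sqrt{1 - B^2}}{1 - \dfrac{V}{c^2}v_x},$$

$$F'_z = \frac{F_z\sqrt{1 - B^2}}{1 - \dfrac{V}{c^2}v_x}, \qquad (\mathbf{F}'\cdot\mathbf{v}') = \frac{(\mathbf{F}\cdot\mathbf{v}) - VF_x}{1 - \dfrac{V}{c^2}v_x}. \qquad (5.36)$$

It is seen from the transformation formulae (Eq. (5.36)) for a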
4-force that if there is no three-dimensional force in some IFR, 
such a force cannot appear in any other IFR. Thus, forces trans- 
form on transition from one IFR to another, but they never appear 
or disappear. 
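
The transformation of the 3-force can be checked numerically. The sketch below is only an illustration: the function names, the choice of units with c = 1 and the sample numbers are assumptions, not part of the text. It evaluates Eq. (5.36) directly and compares the result with the one obtained by transforming the Minkowski force in its real form (Eq. (5.30b)) and dividing by the factor γ' of the particle in K'.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def lorentz_factor(speed):
    """Return 1/sqrt(1 - speed^2/c^2), i.e. gamma (or Gamma) for a given speed."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def transform_velocity(v, V):
    """Particle velocity in K', which moves along x with speed V relative to K."""
    denom = 1.0 - V * v[0] / C**2
    root = math.sqrt(1.0 - (V / C) ** 2)
    return ((v[0] - V) / denom, v[1] * root / denom, v[2] * root / denom)


def transform_force(F, v, V):
    """3-force in K' computed directly from Eq. (5.36)."""
    denom = 1.0 - V * v[0] / C**2
    root = math.sqrt(1.0 - (V / C) ** 2)
    return (
        (F[0] - V * dot(F, v) / C**2) / denom,
        F[1] * root / denom,
        F[2] * root / denom,
    )


def transform_force_via_4vector(F, v, V):
    """Same force obtained from the Lorentz transformation of the 4-force."""
    g = lorentz_factor(math.sqrt(dot(v, v)))
    G, B = lorentz_factor(V), V / C
    f0 = g * dot(F, v) / C                 # zeroth component of the 4-force
    f1, f2, f3 = (g * Fi for Fi in F)      # spatial components gamma * F
    f1_prime = G * (f1 - B * f0)           # only the x component mixes with f0
    v_prime = transform_velocity(v, V)
    g_prime = lorentz_factor(math.sqrt(dot(v_prime, v_prime)))
    return (f1_prime / g_prime, f2 / g_prime, f3 / g_prime)


if __name__ == "__main__":
    F, v, V = (0.3, -0.2, 0.5), (0.4, 0.1, -0.2), 0.6
    print(transform_force(F, v, V))
    print(transform_force_via_4vector(F, v, V))
```

For a particle at rest in K (v = 0) the first function reduces to the special case of Eq. (5.34): the x component is unchanged and the transverse components are multiplied by $\sqrt{1 - B^2}$.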

The validity of the law of inertia in all IFRs follows from this
immediately. If in one IFR no forces act on an object and it 
moves due to inertia (v = const), the same situation will be 
observed in any other IFR (see Eqs. (5.27) and (5.32)). 

§ 5.3. A three-dimensional relativistic equation of motion of a 
particle (the second law of Newton in a relativistic form). Hav- 
ing written down the motion equation in a 4-vector form (Eq. 
(5.23) ) and determined the components of the 4-force (the Min- 
kowski force), we satisfied the principle of relativity for one thing, 
and, for another, obtained the four components of the motion 
equation. The three components provided us with the “motion
equation” per se in a three-dimensional form (Eq. (5.26)), while
the fourth component permitted us to determine the relativistic
expression for energy (Eq. (5.32)). Eq. (5.26) was derived on the
assumption that equations of dynamics must retain their ap- 
pearance in all IFRs, i.e. they must be covariant with respect to 
the Lorentz transformation. However, even without passing from 
one IFR to another, we are aware that the exact equation of 
motion is represented by Eq. (5.26) and not by Eq. (5.19). Let 
us write out these two equations side by side and clarify the
difference between them:

$$\frac{d}{dt}(m\mathbf{v}) = \mathbf{F}; \quad \text{(a)} \qquad \frac{d}{dt}(m\gamma\mathbf{v}) = \mathbf{F}. \quad \text{(b)} \qquad (5.37)$$

First of all, it is clear that when $\beta = v/c \ll 1$, i.e. $\gamma \approx 1$, Eq.
(5.37b) passes into Eq. (5.37a). This means that classical
mechanics is the limiting case of relativistic mechanics when
particles move at non-relativistic velocities. Moreover, to satisfy the
Galilean principle of relativity in classical mechanics, the
Galilean transformation has to be valid, requiring $B \ll 1$ (see § 2.7),
i.e. the relative velocity of the considered reference frames must
be non-relativistic as well.

Sometimes, from the comparison of Eqs. (5.37a) and (5.37b) 
one may conclude that the difference between them consists in the 
fact that in Eq. (5.37b) mass depends on velocity, so that taking 
my for a relativistic mass, we obtain a classical equation. Now 
we shall see that the things are much more complicated, and in 
Supplement IV we shall discuss why there is no sense in introduc- 
ing a dependence of mass on velocity. 

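In order to compare Eqs. (5.37a) and (5.37b), the left-hand
side of (5.37b) should be rewritten using Eq. (5.32) and the
following identity (see also Eq. (5.31)):

$$\frac{d}{dt}(m\gamma\mathbf{v}) = \frac{d}{dt}\left(\frac{\mathscr{E}}{c^2}\mathbf{v}\right) = \frac{\mathbf{v}}{c^2}\frac{d\mathscr{E}}{dt} + \frac{\mathscr{E}}{c^2}\frac{d\mathbf{v}}{dt} = \frac{\mathbf{v}}{c^2}(\mathbf{F}\cdot\mathbf{v}) + m\gamma\frac{d\mathbf{v}}{dt}.$$

Rearranging the terms, one may rewrite Eqs. (5.37a) and (5.37b)
as follows:

$$m\frac{d\mathbf{v}}{dt} = \mathbf{F}, \qquad \text{(a)}$$

$$m\frac{d\mathbf{v}}{dt} = \frac{1}{\gamma}\left[\mathbf{F} - \frac{\mathbf{v}}{c^2}(\mathbf{F}\cdot\mathbf{v})\right] = \frac{1}{\gamma}\left[\mathbf{F} - \boldsymbol{\beta}(\mathbf{F}\cdot\boldsymbol{\beta})\right]. \qquad \text{(b)} \qquad (5.38)$$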


It is noteworthy that in an IFR co-moving with a particle (v = 0) 
Eq. (5.38a) coincides with Eq. (5.38b). It is immediately seen 
from these relations that the principal difference of the relativ- 
istic law of dynamics, Eq. (5.37b), from the classical one, Eq. 
(5.37a), lies in the fact that in the former the direction of a 
3-acceleration does not coincide, generally speaking, with that of 
a force. Consequently, the straightforward comparison of the force 
and acceleration components, readily carried out for the case of 
Eq. (5.37a), is quite impossible here. It is seen from Eq. (5.38b) 
that there are still two cases for which an acceleration and a force 
are oriented along the same direction and the definitions of mass 
in Eqs. (5.37a) and (5.37b) can be directly intercompared. 

(a) Let the force acting on a particle be always normal to its
velocity, i.e. $\mathbf{F} \perp \mathbf{v}$. Then from Eq. (5.38b) the equation of
motion is immediately obtained in the following form:

$$m\gamma\frac{d\mathbf{v}}{dt} = \mathbf{F}, \qquad (5.39)$$

where, according to Eqs. (5.31) and (5.32), $\gamma$ = const. This case
is realized when a charged particle moves in a constant magnetic
field. The Lorentz force $\mathbf{F} = e\,[\mathbf{v} \times \mathbf{B}]$ is so directed that
$\mathbf{F}\cdot\mathbf{v} = 0$ (always). It may be said that the motion of a relativistic
particle in a constant magnetic field proceeds in accordance with the
classical equation of motion (5.19), but with some effective (but
constant) mass $m\gamma$. This is valid, however, only for a special case,
when the condition $\mathbf{F}\cdot\mathbf{v} = 0$ is satisfied. To make sure, let us
consider another case.

(b) Let a force acting on a particle be always directed along
its velocity. This implies, naturally, rectilinear motion of the
particle (for a suitable choice of the initial velocity).
A simple example of such a motion is provided by a charged
particle moving in a plane capacitor when the initial velocity is
directed along the electric field. If $\mathbf{F} \parallel \mathbf{v}$, then
$\mathbf{v}(\mathbf{F}\cdot\mathbf{v}) = \mathbf{F}(\mathbf{v}\cdot\mathbf{v}) = \mathbf{F}v^2$, and from Eq. (5.38b) we obtain the
following equation of motion:

$$m\gamma^3\frac{d\mathbf{v}}{dt} = \mathbf{F},$$

where $\gamma$ is a variable quantity.

Thus, in the two special cases permitting an intercomparison
of Eqs. (5.38a) and (5.38b) we get a different dependence of mass
on velocity; this indicates that there is no universal dependence
of mass on velocity. It is sound practice to use the invariant rest 
mass (see Supplement IV). 
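
The two special cases can be illustrated with a short numerical sketch. It is a hedged example only: the function name `acceleration`, the units with m = c = 1 and the sample velocity v = 0.6c are assumptions made for illustration. The code evaluates Eq. (5.38b) and shows that the ratio of force to acceleration equals mγ for a force normal to the velocity and mγ³ for a force along the velocity.

```python
import math


def acceleration(F, v, m=1.0, c=1.0):
    """3-acceleration from Eq. (5.38b): a = [F - v (F.v)/c^2] / (m * gamma)."""
    v_squared = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - v_squared / c**2)
    F_dot_v = sum(f * x for f, x in zip(F, v))
    return tuple((f - x * F_dot_v / c**2) / (m * gamma) for f, x in zip(F, v))


if __name__ == "__main__":
    v = (0.6, 0.0, 0.0)                    # particle moves along x, gamma = 1.25
    gamma = 1.0 / math.sqrt(1.0 - 0.36)

    a_perp = acceleration((0.0, 1.0, 0.0), v)   # unit force normal to v
    a_par = acceleration((1.0, 0.0, 0.0), v)    # unit force along v

    print("F/a, force normal to v:", 1.0 / a_perp[1], " m*gamma   =", gamma)
    print("F/a, force along v:    ", 1.0 / a_par[0], " m*gamma^3 =", gamma**3)
```

For a force that is neither parallel nor normal to the velocity the acceleration returned by `acceleration` is not collinear with the force, so no single "effective mass" can be read off at all.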

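As in classical mechanics, the equation of dynamics can also
be written for the case when the rest mass of a particle varies
due to the exchange of energy and momentum with the environment.
If a particle receives the 4-momentum
$\vec{\Pi}(\Pi_i) = \left(\gamma\boldsymbol{\Pi},\; i\frac{\gamma}{c}\Phi\right)$ per unit time due to
convection, Eq. (5.22) should be replaced by

$$\frac{d\vec{P}}{d\tau} = \vec{F} + \vec{\Pi},$$

or

$$\frac{d}{d\tau}(mu_i) = \mathfrak{F}_i + \Pi_i, \qquad (5.40)$$

where only $\mathfrak{F}_i$ is a genuine mechanical force satisfying the
condition $\vec{F}\vec{V} = 0$. Writing out the components of Eq. (5.40), we get

$$\frac{d\mathbf{p}}{dt} = \mathbf{F} + \boldsymbol{\Pi}, \qquad \frac{d\mathscr{E}}{dt} = \mathbf{F}\cdot\mathbf{v} + \Phi.$$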


Here $\boldsymbol{\Pi}$ and $\Phi$ are the momentum and the energy delivered to a
particle through convection per unit time. Having composed the
product

$$\vec{\Pi}\vec{V} = \gamma^2(\boldsymbol{\Pi}\cdot\mathbf{v} - \Phi) = -\Phi^0,$$

we see that the quantity $\Phi^0$ is the energy delivery rate in the frame
in which the particle is at rest; it is equal to the rate of change of
the particle’s rest energy. Indeed, differentiating Eq. (5.40), we
obtain

$$m\frac{du_i}{d\tau} + u_i\frac{dm}{d\tau} = \mathfrak{F}_i + \Pi_i. \qquad (5.41)$$

Multiplying the left-hand and right-hand sides of Eq. (5.41) by $u_i$
and taking into account that $\vec{V}(d\vec{V}/d\tau) = 0$ and $\vec{F}\vec{V} = 0$, we get

$$\Phi^0 = -\Pi_i u_i = c^2\frac{dm}{d\tau} = \frac{d\mathscr{E}_0}{d\tau}.$$

Assuming that the genuine mechanical force must satisfy the
condition

$$m\frac{du_i}{d\tau} = \mathfrak{F}_i,$$

the following quantity should be treated as a mechanical force
(see Eq. (5.41)):

$$m\frac{du_i}{d\tau} = \mathfrak{F}_i + \Pi_i - u_i\frac{dm}{d\tau} = \mathfrak{F}_i + \Pi_i - u_i\frac{\Phi^0}{c^2} = \mathfrak{F}_i + \Pi_i + u_i\frac{\Pi_k u_k}{c^2}$$


in the case of the convective transfer of momentum and energy.
When the genuine mechanical force is absent, the mechanical
“reactive” force should be taken into consideration:

$$R_i = \Pi_i + u_i\frac{\Pi_k u_k}{c^2}.$$

This force satisfies the condition $\vec{R}\vec{V} = 0$.

In a particular case the momentum $\boldsymbol{\Pi}$ can be transferred not by
mechanical means but, for example, via radiation or heat
exchange between the particle and the environment. In the case of
a pure heat transfer the 4-momentum of heat transferred to the
particle during the time $d\tau$ is equal to
$\delta Q_i = \Pi_i\,d\tau = (\delta\mathbf{p},\; i(\delta Q/c))$.

Consequently, $\delta\mathbf{p} = \boldsymbol{\Pi}\,dt$, $\delta Q = \Phi\,dt$ are the momentum and
the energy of the heat transferred during the time dt. In this case
the 4-momentum of the transferred heat is a 4-vector.

§ 5.4. The relativistic expression for a particle’s energy. Note 
that Eq. (5.31) can be obtained not only as the fourth component 
of Eq. (5.23), but directly from Eq. (5.37b) and exactly in the 
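same way as Eq. (5.20) follows from Eq. (5.19) in classical
mechanics. To make sure, let us take the scalar product of both the
left-hand and right-hand sides of Eq. (5.38b) with v. We obtain

$$m\,\mathbf{v}\cdot\frac{d\mathbf{v}}{dt} = \frac{1}{\gamma}\left[\mathbf{F}\cdot\mathbf{v} - \frac{v^2}{c^2}(\mathbf{F}\cdot\mathbf{v})\right] = \frac{1}{\gamma^3}\,\mathbf{F}\cdot\mathbf{v}.$$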


This is just Eq. (5.31) with Eq. (5.14) taken into consideration. 
On the other hand, the fourth component of Eq. (5.23) turned out 
to yield the energy conservation law. The expression for energy 
given by Eq. (5.32) differs essentially from the classical one. In 
Newtonian mechanics the energy of a motionless free object (i.e. 
an object possessing no potential energy) is assumed to be zero, 
so that the kinetic energy is clearly defined as $T = mv^2/2$. In
relativistic mechanics the total energy of a free particle is defined
as $\mathscr{E} = mc^2\gamma$. We call this energy “total” because it includes the
energy of a resting object (the rest energy $mc^2$). But we mentioned
earlier that Eq. (5.31) defines the energy only up to an additive
constant; consequently, having selected the appropriate value
for it (by setting the constant equal to $-mc^2$), one could have
regarded, as in Newtonian mechanics, the energy of a resting object
as equal to zero. But it is impossible to do so in the STR. In STR
mechanics one should not forget about the transformation rules for
various quantities and the principle of correspondence with classical
mechanics: many classical and relativistic quantities must coincide
in the limiting case $\beta \to 0$. The Lorentz transformation is known
to turn into the Galilean transformation at small relative velocities
of reference frames ($B \to 0$, $\Gamma \to 1$); at small velocities of
particles ($\beta \to 0$) the three-dimensional relativistic momentum turns
into the three-dimensional classical one, $m\gamma\mathbf{v} \to m\mathbf{v}$. Suppose we
determined the total energy of a free relativistic particle in the form
$\mathscr{E} = mc^2\gamma + C$; then in the limiting case $\beta \to 0$ we should obtain
$\mathscr{E} = mc^2 + C$. Let us examine now the transformation of the
4-momentum components on transition from one IFR to another. It
is performed in accordance with the following equations:

$$P'_1 = \Gamma(P_1 + iBP_4), \quad P'_2 = P_2, \quad P'_3 = P_3, \quad P'_4 = \Gamma(P_4 - iBP_1). \qquad (5.42)$$


Substituting the values of the 4-momentum components $P_1$, $P_2$, $P_3$
and $P_4$ from Eq. (5.21), we obtain

$$p'_x = \Gamma\left(p_x - \frac{B}{c}\mathscr{E}\right), \quad p'_y = p_y, \quad p'_z = p_z, \quad \mathscr{E}' = \Gamma(\mathscr{E} - Vp_x), \qquad (5.43)$$

where $p_x$, $p_y$, $p_z$ and $p'_x$, $p'_y$, $p'_z$ are the components of the
three-dimensional relativistic momentum $\mathbf{p} = m\gamma\mathbf{v}$. In the limiting case
corresponding to a transition to classical mechanics, when $B \to 0$,
$\beta \to 0$ and $p_x \to mv_x$, $p'_x \to mv'_x$, $\mathscr{E} \to mc^2 + C$, the first relation of
(5.43) would lead to $mv'_x = mv_x - mV - CV/c^2$. But the latter
equation must yield the classical law of velocity summation:
$v'_x = v_x - V$. It will indeed be the case if C = 0. That is how we prove
the validity of Eq. (5.32). It should be noted that the principle of
correspondence between the classical and the relativistic
expressions for energy fails only because in the framework of
Newtonian mechanics one could not detect the existence of the rest
energy, and the additive constant was chosen without allowing for
the rest energy (see below).

It is seen from Eq. (5.32) that the total energy does not turn 
into zero even when the velocity of an object is equal to zero 
($\gamma$ = 1 at v = 0). The energy of a free particle in the frame in
which it is at rest is equal to $mc^2$ and is called the rest energy
$\mathscr{E}_0$. Although we dealt with a particle until now, its elementary
character was not discussed. Therefore, all equations derived are
quite applicable to any complex object (system) composed of
diverse components. Naturally, m will then represent the total mass
of the object, and v the velocity of its motion as a whole. The
equation $\mathscr{E}_0 = mc^2$ is valid for any object that rests as a whole.
Consequently, the object’s rest mass defines the total energy
confined in it regardless of the origin of this energy.

In classical mechanics the energy of a resting object may be 
either positive or negative: it is defined with an accuracy of a 
constant value. In relativistic mechanics the energy of a free ob- 
ject (or the energy of a closed system) is always positive and 
related to the object’s rest mass; the rest mass of an object defines 
its rest energy. The inertia of an object proved to be a measure 
of the object’s energy. Whenever the object’s energy changes by
$\Delta\mathscr{E}$, the mass of this object varies by $\Delta m = \Delta\mathscr{E}/c^2$.

The question arises as to how such a high energy, the rest energy
confined within an object, could remain unnoticed. In
fact, one gram of a substance contains about $10^{21}$ ergs. The total
amount of energy confined within a system is not so essential,
however, as the part of energy that can be utilized. Although any
mass stores a huge amount of energy, its release is far from
being easy. Only recently have people learned to employ
atomic energy. Until a short while ago the rest energy had not
been released (and, accordingly, mass had always been constant).
Since the released energy always represents a difference of
energies, the existence of the rest energy did not manifest itself in
any way.

Since $c^2$ is very large, a change of mass accompanying a change
of an object’s energy is very small and cannot be experimentally
detected even though weighing was always one of the most
precise kinds of measurement. For example, 1 kg of water heated
by 100 degrees gains only $5\cdot10^{-9}$ g of mass. Such a minute mass
change cannot be detected even by means of a most sensitive 
modern balance. The formation of nuclei, however, involves quite 
perceptible mass changes; moreover, this mass defect determines 
the binding energy (see § 5.6). 
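
The two estimates quoted above (the rest energy of one gram and the mass gained by heated water) follow directly from $\Delta m = \Delta\mathscr{E}/c^2$. The sketch below reproduces them; the specific heat of water, about 4186 J/(kg·K), and the rounded value of c are assumed inputs, and the variable names are illustrative.

```python
C_LIGHT = 2.998e8              # speed of light, m/s (rounded)
SPECIFIC_HEAT_WATER = 4186.0   # J/(kg K), assumed textbook value


def mass_equivalent(energy_joules):
    """Mass change (kg) that accompanies an energy change, dm = dE / c^2."""
    return energy_joules / C_LIGHT**2


# Rest energy of one gram of matter, converted from joules to ergs (1 J = 1e7 erg).
rest_energy_erg = 1e-3 * C_LIGHT**2 * 1e7

# Heat needed to warm 1 kg of water by 100 degrees, and the mass it adds in grams.
heat_joules = 1.0 * SPECIFIC_HEAT_WATER * 100.0
delta_m_grams = mass_equivalent(heat_joules) * 1e3

if __name__ == "__main__":
    print(f"rest energy of 1 g: {rest_energy_erg:.2e} erg")
    print(f"mass gained by the water: {delta_m_grams:.2e} g")
```

Both numbers agree in order of magnitude with the figures given in the text: roughly $10^{21}$ erg per gram and a few times $10^{-9}$ g for the heated water.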

In relativistic mechanics it is natural to define the part of the
particle’s energy that turns into zero at v = 0 as the kinetic energy.
It is obtained by subtracting the rest energy from the total energy
of a particle:

$$T = \mathscr{E} - mc^2 = mc^2(\gamma - 1). \qquad (5.44)$$


The same result can be obtained when the work of a force is
calculated according to the equation of relativistic dynamics:

$$dT = \mathbf{F}\cdot\mathbf{v}\,dt = \mathbf{v}\cdot d(m\gamma\mathbf{v}) = m\gamma v\,dv + mv^2\,d\gamma = m\gamma v\,dv + mv^2\gamma^3(\beta\,d\beta) = m\gamma^3 v\,dv\left(\frac{1}{\gamma^2} + \frac{v^2}{c^2}\right) = m\gamma^3 v\,dv.$$

Hence

$$T = m\int\gamma^3 v\,dv = mc^2\gamma + \text{const}.$$

If T = 0 at v = 0 (i.e. at $\gamma$ = 1), const = $-mc^2$, whence it
follows again that $T = mc^2(\gamma - 1)$.
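
The integral $T = m\int\gamma^3 v\,dv$ can also be checked numerically. The sketch below is an illustration only: it assumes units with m = c = 1, a plain midpoint rule and an arbitrary final speed of 0.8c, and compares the accumulated work with the closed form $mc^2(\gamma - 1)$.

```python
import math


def kinetic_energy_by_work(v_final, m=1.0, c=1.0, steps=100_000):
    """Accumulate dT = m * gamma^3 * v * dv from 0 to v_final (midpoint rule)."""
    dv = v_final / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * dv                         # midpoint of the k-th slice
        gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
        total += m * gamma**3 * v * dv
    return total


def kinetic_energy_closed_form(v, m=1.0, c=1.0):
    """Relativistic kinetic energy T = m c^2 (gamma - 1)."""
    return m * c**2 * (1.0 / math.sqrt(1.0 - (v / c) ** 2) - 1.0)


if __name__ == "__main__":
    v = 0.8
    print("work integral:", kinetic_energy_by_work(v))
    print("closed form:  ", kinetic_energy_closed_form(v))
    print("classical:    ", 0.5 * v**2)
```

At v = 0.8c the classical value $mv^2/2$ is less than half of the relativistic one, which shows how quickly the two expressions diverge once β is no longer small.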

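Let us find the conditions under which Eq. (5.44) is converted
into the expression for the classical kinetic energy. Expanding $\gamma$
in a series

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} = 1 + \frac{1}{2}\beta^2 + \frac{3}{8}\beta^4 + \dots,$$

we see that

$$T = \frac{mv^2}{2} + \frac{3}{8}\,m\frac{v^4}{c^2} + \dots, \qquad (5.45)$$

i.e. the classical kinetic energy is meaningful inasmuch as
$\beta \ll 1$ and the term of order $\beta^4$ can be ignored.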

Having designated the classical kinetic energy by $T_{cl}$ and the
relativistic one by $T_{rel}$, we can rewrite Eq. (5.45) as follows:

$$T_{rel} = T_{cl} + \frac{3}{8}\,m\frac{v^4}{c^2} + \dots$$

The ratio $T_{rel}/T_{cl}$ is given by the following expression:

$$\frac{T_{rel}}{T_{cl}} = 1 + \frac{3}{4}\beta^2 + (\text{terms of the order of } \beta^4 \text{ and higher}).$$

In nuclear physics, where “the limits of applicability of
Newtonian mechanics” have to be defined more accurately, it is
assumed that $T_{rel} = T_{cl}$ provided the second term on the right-hand
side of the equation is less than one per cent (remember that
$\beta < 1$, and the terms of the series decrease rapidly). Consequently,
the boundary (which is, of course, conventional) may be found from
the equation $\frac{3}{4}\beta^2 = 0.01$, $\beta \approx 0.12$.


Since $\gamma = \gamma(v)$, and $\gamma = \mathscr{E}/mc^2$ for a particle, one may speak
of relativistic velocities when the total energy $\mathscr{E}$ of a particle
appreciably exceeds its rest energy, i.e. when the condition
$(\mathscr{E}/mc^2) > 1$ is met. Surely both conditions are qualitatively
equivalent, but it should be borne in mind that when the velocity
approaches its limit (c), the energy tends to infinity. Therefore, very
small velocity variations in the vicinity of c may alter radically the
particle’s energy.

In nuclear physics it is more convenient to deal with energies 
of particles rather than with their velocities. Of course, particles 
of different masses will have different energy limits defining the 
applicability of classical mechanics. For example, in the case of 
electrons this limit is equal to 3 kev, while for protons 7 Mev (it 
would be useful for you to try to obtain these numbers yourself). 
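
One possible way of obtaining these numbers is sketched below. It is a hedged illustration: the rest energies 0.511 MeV and 938.272 MeV are standard rounded values supplied here as assumptions, and the helper name `kinetic_energy` is illustrative. The boundary velocity is taken from $\frac{3}{4}\beta^2 = 0.01$ and inserted into $T = mc^2(\gamma - 1)$.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511    # assumed rounded value of m_e c^2
PROTON_REST_ENERGY_MEV = 938.272    # assumed rounded value of m_p c^2


def kinetic_energy(rest_energy, beta):
    """T = m c^2 (gamma - 1), in the same units as rest_energy."""
    return rest_energy * (1.0 / math.sqrt(1.0 - beta**2) - 1.0)


# Boundary of the classical region: (3/4) * beta^2 = 0.01.
beta_limit = math.sqrt(0.01 * 4.0 / 3.0)

electron_limit_kev = kinetic_energy(ELECTRON_REST_ENERGY_MEV, beta_limit) * 1e3
proton_limit_mev = kinetic_energy(PROTON_REST_ENERGY_MEV, beta_limit)

if __name__ == "__main__":
    print(f"beta at the boundary: {beta_limit:.4f}")
    print(f"electron limit: {electron_limit_kev:.2f} keV")
    print(f"proton limit:   {proton_limit_mev:.2f} MeV")
```

With the unrounded boundary β ≈ 0.1155 the limits come out somewhat above 3 keV and somewhat above 6 MeV; using the rounded β ≈ 0.12 shifts them up by roughly ten per cent, so the figures in the text should be read as order-of-magnitude estimates.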

Eq. (5.32) may be rewritten as

$$\mathscr{E} = \gamma\mathscr{E}_0, \qquad (5.46)$$

if the rest energy $\mathscr{E}_0 = mc^2$ is taken into account. The rest energy
$\mathscr{E}_0$ comprises all kinds of energy possessed by an object (or a
system). Eq. (5.46) shows that all kinds of energy increase $\gamma$-fold
on transition from the proper (co-moving) frame to any other
inertial frame. There was nothing of the kind in classical
mechanics. On the other hand, the total energy of a particle (Eq.
(5.32)) and its kinetic energy (Eq. (5.44)) grow without bound
when $v \to c$. This result has a plain physical meaning. A particle
whose rest mass differs from zero cannot attain the velocity equal 
to c. This can be inferred from the fact that the particle would 
require an infinite energy in order to achieve that velocity. Here 
the limitedness of the velocity of light in vacuo shows up again. 
When treating light quanta (photons) as relativistic particles (see 
§ 7.6), one should bear in mind that they belong to another class 
of particles and could not come into being as a result of acceler- 
ation of conventional particles, that is through a dynamic transi- 
tion. The limit transition v > c is carried out in nature, but the 
terminal point of this transition (v = c) is never reached. 

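§ 5.5. A 4-vector of energy-momentum. The fourth (or zeroth)
component of the 4-momentum of a free particle has a direct
bearing on the particle’s energy. This is evident from the following
straightforward transformation:

$$P_4 = imc\gamma = \frac{i}{c}mc^2\gamma = \frac{i}{c}\mathscr{E}; \quad \text{(a)} \qquad P^0 = mc\gamma = \frac{\mathscr{E}}{c}. \quad \text{(b)}$$

Consequently, the 4-vector $\vec{P}$ is referred to as the 4-vector of
energy-momentum of a particle. From Eq. (5.7) and from the fact that
$\vec{P} = m\vec{V}$ it follows that

$$\vec{P}^2 = -m^2c^2; \quad \text{(a)} \qquad \vec{P}^2 = m^2c^2. \quad \text{(b)} \qquad (5.47)$$

The component transformation law for the 4-vector $\vec{P}$ is given by
Eqs. (5.42) and (5.43). It remains to rewrite Eq. (5.22) in the
final form

$$\vec{P}\left(\mathbf{p},\; \frac{i}{c}\mathscr{E}\right); \quad \text{(a)} \qquad \vec{P}\left(\frac{\mathscr{E}}{c},\; \mathbf{p}\right), \quad \text{(b)} \qquad (5.48)$$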


where $\mathbf{p} = m\gamma\mathbf{v}$. Let us consider a particle in the reference frame
in which its relativistic 3-momentum $\mathbf{p} = m\gamma\mathbf{v}$ is equal to zero.
The frame in which a particle is at rest (v = 0) may be called
a proper reference frame. Let the particle’s energy in this frame
be equal to $\mathscr{E}^0$. Then in the frame $K'$ in accordance with Eq.
(5.43):

$$\mathscr{E}' = \Gamma\mathscr{E}^0, \qquad p'_x = -\Gamma\frac{B}{c}\mathscr{E}^0 = -\Gamma\frac{\mathscr{E}^0}{c^2}V. \qquad (5.49)$$

It is seen from Eq. (5.49) that the transfer of energy by a
particle is associated with the appearance of a momentum. Indeed,
in the proper frame a particle possesses the energy $\mathscr{E}^0$ which does
not move in space. The momentum of the particle (an energy
carrier) in this frame is equal to zero. In the frame $K'$ the particle
moves; its velocity is equal to $-V$. This means that the energy
“flows” at this velocity. Eq. (5.49) for $p'_x$ shows that the energy
flow involves the momentum $p'_x = -\Gamma(\mathscr{E}^0/c^2)V$. This momentum
coincides with the relativistic three-dimensional momentum because
according to Eq. (5.32) $\mathscr{E}^0/c^2 = m$; the particle’s velocity is equal
to $-V$, and $\Gamma$ coincides with $\gamma$ in this case.

Thus, a momentum must be attributed to an energy carrier (a
particle, in this case). Although we have obtained this result for a
particle, it has a general significance; we shall come across it
again when examining an electromagnetic field (Chapter 6).

We would like to emphasize here that the very fact of integrat- 
ing certain quantities into a 4-vector points to an intimate con- 
nection between them. The quantities which are 4-vector compo- 
nents (usually a 3-vector and a scalar) constitute in a sense a 
closed combination: to calculate the energy and momentum of a 
particle in the frame K’, one should know those in the frame K 
(see Eq. (5.43)). The fourth (or zeroth) component of a 4-vector 
of energy-momentum cannot turn into zero. If in some frame the 
energy and momentum of a particle turn into zero, they are equal 
to zero in any other reference frame. This is where relativistic 
and classical relations differ fundamentally. In classical me- 
chanics the energy and momentum of a stationary particle are 
equal to zero. 

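The square of a 4-vector is an important invariant. Let us write
it out:

$$\vec{P}^2 = p^2 - \frac{\mathscr{E}^2}{c^2} = -m^2c^2; \quad \text{(a)} \qquad \vec{P}^2 = \frac{\mathscr{E}^2}{c^2} - p^2 = m^2c^2 \quad \text{(b)} \qquad (5.50)$$

(we have used Eqs. (5.47) and (5.48)). Needless to say that it is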
equal in magnitude but opposite in sign in these two cases. What 
is important, we have found an invariant relationship between a 
relativistic momentum and a relativistic energy of a particle; it 
is essential in this case that the invariant equation (5.50) de- 
termines an invariant mass, a rest mass of a particle. 
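
The invariance can be verified numerically. In the sketch below the function names, the units with c = 1 and the sample values are assumptions made for illustration; the transformation used is the one that follows from Eq. (5.42) for a frame K' moving with velocity V along x, namely $p'_x = \Gamma(p_x - V\mathscr{E}/c^2)$ and $\mathscr{E}' = \Gamma(\mathscr{E} - Vp_x)$.

```python
import math


def boost(E, p, V, c=1.0):
    """Energy and momentum in K', which moves with velocity V along x."""
    G = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    E_prime = G * (E - V * p[0])
    px_prime = G * (p[0] - V * E / c**2)
    return E_prime, (px_prime, p[1], p[2])


def rest_mass(E, p, c=1.0):
    """Mass from the invariant E^2/c^2 - p^2 = m^2 c^2 (form (b) of Eq. (5.50))."""
    return math.sqrt(E**2 / c**2 - sum(x * x for x in p)) / c


if __name__ == "__main__":
    m, v = 2.0, (0.3, 0.4, 0.0)
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v))
    E, p = m * gamma, tuple(m * gamma * x for x in v)

    E_prime, p_prime = boost(E, p, V=0.5)
    print("energy and momentum in K: ", E, p)
    print("energy and momentum in K':", E_prime, p_prime)
    print("rest mass from each frame:", rest_mass(E, p), rest_mass(E_prime, p_prime))
```

Energy and momentum differ between the two frames, while the combination $\mathscr{E}^2/c^2 - p^2$ returns the same mass in both.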

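From Eq. (5.50) one can express the particle’s energy in terms
of its momentum:

$$\mathscr{E} = c\sqrt{p^2 + m^2c^2}. \qquad (5.51)$$

The particle’s energy expressed in terms of its momentum is
referred to as the Hamiltonian function $\mathscr{H}$ of a particle. Thus,
Eq. (5.51) gives the Hamiltonian function of a particle. It is
common knowledge that the derivatives of the Hamiltonian function
with respect to the momentum components yield the velocity
components of a particle:

$$\frac{\partial\mathscr{H}}{\partial p_x} = \frac{\partial\mathscr{E}}{\partial p_x} = \frac{dx}{dt} = v_x, \;\dots, \quad \text{or} \quad \mathbf{v} = \nabla_{\mathbf{p}}\mathscr{E}. \qquad (5.52)$$

Eq. (5.52) may be derived by differentiating Eq. (5.50):

$$p_x\,dp_x + p_y\,dp_y + p_z\,dp_z = \frac{1}{c^2}\,\mathscr{E}\,d\mathscr{E}, \qquad (5.53)$$

whence

$$\frac{\partial\mathscr{E}}{\partial p_x} = \frac{c^2p_x}{\mathscr{E}}, \quad \frac{\partial\mathscr{E}}{\partial p_y} = \frac{c^2p_y}{\mathscr{E}}, \quad \frac{\partial\mathscr{E}}{\partial p_z} = \frac{c^2p_z}{\mathscr{E}}. \qquad (5.54)$$

But inasmuch as $\mathscr{E} = mc^2\gamma$ and $\mathbf{p} = m\gamma\mathbf{v}$, then

$$\mathbf{p} = \frac{\mathscr{E}}{c^2}\mathbf{v} \qquad (5.55)$$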


for a particle, and we get back to Eq. (5.52). 
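
The relation between the Hamiltonian function and the velocity can be illustrated by differentiating Eq. (5.51) numerically. This is a minimal sketch under stated assumptions: units with m = c = 1, a central-difference step chosen by hand, and illustrative function names. The finite-difference gradient is compared with $\mathbf{v} = c^2\mathbf{p}/\mathscr{E}$ from Eq. (5.55).

```python
import math


def hamiltonian(p, m=1.0, c=1.0):
    """Free-particle Hamiltonian, Eq. (5.51): H = c * sqrt(p^2 + m^2 c^2)."""
    return c * math.sqrt(sum(x * x for x in p) + (m * c) ** 2)


def velocity_from_hamiltonian(p, m=1.0, c=1.0, h=1e-6):
    """Central-difference estimate of the gradient of H with respect to p."""
    v = []
    for k in range(3):
        up, down = list(p), list(p)
        up[k] += h
        down[k] -= h
        v.append((hamiltonian(up, m, c) - hamiltonian(down, m, c)) / (2.0 * h))
    return tuple(v)


def velocity_from_momentum(p, m=1.0, c=1.0):
    """Velocity from Eq. (5.55): v = c^2 p / E."""
    E = hamiltonian(p, m, c)
    return tuple(c**2 * x / E for x in p)


if __name__ == "__main__":
    p = (0.75, -0.2, 1.5)
    print(velocity_from_hamiltonian(p))
    print(velocity_from_momentum(p))
```

However large the momentum is made, the speed obtained this way stays below c, since $c^2p/\mathscr{E} = cp/\sqrt{p^2 + m^2c^2} < c$.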

In what follows we shall need a formula which is easy to
derive from Eq. (5.52): multiplying the left-hand and right-hand
sides of Eq. (5.52) by $d\mathbf{p}$, we obtain $\mathbf{v}\,d\mathbf{p} = d\mathscr{E}$. Those who are
not particularly inclined toward a treatment of gradients should
note that the same result follows at once from Eqs. (5.37a) and
(5.37b) both in the classical and relativistic cases although
momentum is defined there in different ways, of course.
Multiplying the left-hand and right-hand sides of Eqs. (5.37a) and
(5.37b) by $\mathbf{v}\,dt$, we obtain $\mathbf{F}\,d\mathbf{r}$, i.e. $d\mathscr{E}$, on the right-hand side,
and $\mathbf{v}\,d\mathbf{p}$ on the left-hand one. Thus, in both classical and
relativistic mechanics we have the same formula

$$d\mathscr{E} = \mathbf{v}\,d\mathbf{p}, \qquad (5.56)$$


but the definition of energy and momentum will be different. 

In the simplest case of one-dimensional motion $\mathscr{E} = \mathscr{E}(p)$ is
determined according to Eq. (5.51). In the plane of the variables
$\mathscr{E}$, p the velocity is given by the tangent of the slope angle of the
curve $\mathscr{E} = \mathscr{E}(p)$ at a given point. When $p \gg mc$, Eq. (5.51)
becomes

$$\mathscr{E} = cp. \qquad (5.51')$$


Particles with a finite rest mass are called ultra-relativistic if the 
last relation holds for them. Eq. (5.51’) is valid both for ultra- 
relativistic particles and for photons. We shall see in § 7.6 that 
light quanta (photons) may be treated as relativistic particles. 
But now it is worthwhile to point out that they are quite special 
particles. In any IFR these particles possess a finite momentum
and a finite energy. Their velocity in vacuo is the same in any
IFR. They cannot originate from any of the particles possessing 
a finite rest mass by means of their acceleration. Finally, we infer 
from Eq. (5.50) that the photon’s rest mass is equal to zero. 

Let us consider some more relationships for a particle located
in an external potential field. Since the field is assumed to
propagate at a finite velocity, the primary principle of the STR,
the finiteness of the signal transmission velocity, is observed. As
to the force acting on a particle, it is defined by the value
of the potential function at the point at which the particle is
located (the field is stationary).

When a particle is located in a potential field, then
$\mathbf{F}\cdot\mathbf{v}\,dt = -dU$ and Eq. (5.31) turns into $d(mc^2\gamma) = -dU$, whence
follows the law of total energy conservation for a relativistic
particle in a potential field:

$$mc^2\gamma + U = \text{const}. \qquad (5.57)$$


(The energy is total in the sense that the sum of the relativistic
energy and potential energy of a particle remains constant.) In
relativistic mechanics the kinetic energy is equal to $mc^2(\gamma - 1)$;
having altered the magnitude of the constant on the right-hand
side by $mc^2$, the energy conservation law can be rewritten in the
form

$$mc^2(\gamma - 1) + U = \text{const}. \qquad (5.58)$$
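
The conservation law can be illustrated by integrating the relativistic equation of motion for a simple potential. Everything specific in the sketch below is an assumption chosen for illustration: a constant force F along x (so that U = -Fx), a particle starting from rest at x = 0, units with m = c = 1, and a midpoint stepping scheme. The quantity $mc^2\gamma + U$ is recorded at every step.

```python
import math


def simulate_uniform_field(F=1.0, m=1.0, c=1.0, t_end=5.0, steps=5000):
    """Integrate dp/dt = F, dx/dt = v(p) and return m c^2 gamma + U at each step."""
    dt = t_end / steps
    x, p = 0.0, 0.0
    energies = []
    for _ in range(steps + 1):
        gamma = math.sqrt(1.0 + (p / (m * c)) ** 2)    # gamma expressed via momentum
        energies.append(m * c**2 * gamma - F * x)       # total energy with U = -F x
        p_mid = p + 0.5 * F * dt                        # momentum at mid-step
        v_mid = p_mid / (m * math.sqrt(1.0 + (p_mid / (m * c)) ** 2))
        x += v_mid * dt
        p += F * dt
    return energies


if __name__ == "__main__":
    energies = simulate_uniform_field()
    print("initial total energy:", energies[0])
    print("largest deviation:   ", max(abs(e - energies[0]) for e in energies))
```

The total energy stays equal to its initial value $mc^2$ to within the discretization error, although γ grows steadily and U decreases steadily along the trajectory.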


When a particle is located in a conservative field, its velocity
and potential energy may change in the course of motion but the
value of $\mathscr{E} = mc^2\gamma + U$ remains constant (see Eq. (5.57)) in that
it is time independent in a given IFR. The quantity $\mathscr{E}$ may be
called the total energy of a particle in a conservative field. Surely,
this quantity remains constant in any IFR but changes its
(constant) value for another on transition from one IFR to
another. The definition of the 4-momentum for a particle in a
conservative field as $\vec{P} = m\vec{V}$ holds good, but Eq. (5.48) is to be
replaced by $\vec{P}\left[m\gamma\mathbf{v},\; \frac{i}{c}(\mathscr{E} - U)\right]$, since
$im\gamma c = \frac{i}{c}m\gamma c^2 = \frac{i}{c}(\mathscr{E} - U)$,
as is seen from Eqs. (5.21) and (5.57); using the relation
$\vec{P}^2 = -m^2c^2$, Eq. (5.51) yields the Hamiltonian function for a
particle in a conservative field:

$$\mathscr{E} = \mathscr{H} = c\sqrt{p^2 + m^2c^2} + U. \qquad (5.59)$$

Just as in the case of a free particle, the relativistic 3-momentum
can be expressed via the energy, velocity and potential energy:

$$\mathbf{p} = m\gamma\mathbf{v} = \frac{1}{c^2}\,mc^2\gamma\,\mathbf{v} = \frac{\mathscr{E} - U}{c^2}\,\mathbf{v}.$$

The transformation of the fourth component of the energy-momentum
vector for a particle in a potential field shows that according to
Eq. (5.42) (and making use of Eq. (5.10) as well)
the total energy is equal to $\mathscr{E}' = mc^2\gamma' + U'$ in the frame $K'$, as
one would expect.

§ 5.6. The rest mass of a system. The binding energy. So far 
we have been considering mechanics of a “particle”, i.e. the
behaviour of a certain single unit. However, the “elementariness”
(indivisibility) of a particle was, in fact, nowhere assumed, so 
that all conclusions may be transferred to more complex systems 
consisting of “subsystems”. 

The rest mass M of a complex system is defined in accordance
with the general formula of Eq. (5.50):

$$M^2c^2 = \frac{E^2}{c^2} - \mathbf{P}^2, \qquad (5.60)$$

where E is now the total energy of the system and $\mathbf{P}$ its total
momentum.

Let us confine ourselves to the simplest systems consisting of 
individual particles. First suppose that the particles do not in- 
teract with one another. Then the energy of such a system will 
be the sum of energies of particles comprising the system: 


E=)&,. (5.61) 


The additivity of energies signifies the absence of interaction. The
total momentum of a system always adds up vectorwise from the
momenta of individual particles, i.e. it is always additive:

$$\mathbf{P} = \sum_i \mathbf{p}_i. \tag{5.62}$$

In this case the rest mass of the system can be written as follows:

$$M^2c^2 = \frac{1}{c^2}\Bigl(\sum_i \mathcal{E}_i\Bigr)^2 - \Bigl(\sum_i \mathbf{p}_i\Bigr)^2. \tag{5.63}$$


In order to find the relationship between the rest mass of a
system and the rest masses of particles comprising the system, one
should pass over to the reference frame in which the total momentum
of the system is equal to zero: $\mathbf{P} = 0$. Then from Eq. (5.63)
we obtain

$$M = \frac{1}{c^2}\sum_i \mathcal{E}_i. \tag{5.64}$$

We see that the rest mass of a system is expressed as the sum
of energies of constituent particles (divided by $c^2$). But the energy
of an individual particle can be represented according to Eq.
(5.44) as the sum of its rest energy and kinetic energy

$$\mathcal{E}_i = m_ic^2 + T_i. \tag{5.65}$$

Then in accordance with Eq. (5.63) we get

$$M = \sum_i m_i + \frac{1}{c^2}\sum_i T_i. \tag{5.66}$$


Eq. (5.66) yields the important result: the rest mass of a sys- 
tem exceeds the sum of rest masses of constituent particles by 
the total kinetic energy of the particles (divided by $c^2$) estimated
in the reference frame in which the total momentum of the system 
is equal to zero. 
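
The statement of Eqs. (5.63) and (5.66) can be checked numerically. The following Python fragment is a minimal sketch under stated assumptions: the function names, the units with c = 1 and the sample momenta are illustrative choices and do not come from the text.

```python
import math

C = 1.0  # assumed units: c = 1, so mass, momentum and energy share one unit


def particle_energy(m, p):
    """Energy of a free particle, E = sqrt((p c)^2 + (m c^2)^2)."""
    p_squared = sum(component**2 for component in p)
    return math.sqrt(p_squared * C**2 + (m * C**2) ** 2)


def system_rest_mass(masses, momenta):
    """Rest mass M from Eq. (5.63): M^2 c^2 = (sum E_i)^2 / c^2 - (sum p_i)^2."""
    total_energy = sum(particle_energy(m, p) for m, p in zip(masses, momenta))
    total_momentum = [sum(p[k] for p in momenta) for k in range(3)]
    total_p_squared = sum(component**2 for component in total_momentum)
    # max(...) guards against a tiny negative value caused by rounding
    return math.sqrt(max(total_energy**2 / C**2 - total_p_squared, 0.0)) / C


# Two unit-mass particles with opposite momenta: the total momentum is zero.
masses = [1.0, 1.0]
momenta = [(0.75, 0.0, 0.0), (-0.75, 0.0, 0.0)]

M = system_rest_mass(masses, momenta)
kinetic = sum(particle_energy(m, p) - m * C**2 for m, p in zip(masses, momenta))
print(M, sum(masses) + kinetic / C**2)  # left and right sides of Eq. (5.66)
```

In this example the rest mass of the pair exceeds the sum of the two rest masses by exactly the total kinetic energy (divided by $c^2$), as Eq. (5.66) requires.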

This way we reach a conclusion that in relativistic mechanics 
the rest mass of the system composed of non-interacting particles 
is not an additive quantity. Such a property of mass is uncommon 
in classical mechanics. It is tempting to bring in some new 
definition of mass for constituent particles, so that the rest mass 
of a system will add up from these new masses, sometimes called
“relativistic”. It is easy to see how this can be done. In the
reference frame where $\mathbf{P} = 0$ we obtain, according to Eq. (5.64):

$$Mc^2 = \sum_i \mathcal{E}_i = \sum_i m_i\gamma_ic^2. \tag{5.67}$$

Consequently, one can write down

$$M = \sum_i m_i\gamma_i = \sum_i m_i^{\mathrm{rel}}, \tag{5.68}$$

where $m_i^{\mathrm{rel}} = m_i\gamma_i$ is a relativistic mass. That is how we realize
additivity (which is by no means obligatory), but at the same 
time we clear the way for various delusions. In fact, the introduc- 
tion of a relativistic mass for a particle creates an illusion that 
the increase of the particle’s energy, or “relativistic mass”, ac- 
companying the increase of its velocity (momentum) is associated 
with the changes of the internal structure of the particle. But 
surely, there is no such thing at all (to make sure, one may just
pass over to another frame without approaching the particle).
As a matter of fact, the energy grows with the increase of velocity 
due to the special properties of the 4-space-time coming through 
in the Lorentz transformation. 

In terms of the four-dimensional approach the term “mass”
refers to the invariant norm of the 4-vector of energy-momentum.
By bringing in a relativistic mass we actually apply the term
“mass” (to within a constant factor) to the time component of
the 4-vector of energy-momentum, which is, as we know, the
energy. However, the energy and the rest mass that we intend to
use represent essentially different physical concepts.

Energy is a relative quantity; it depends on the IFR in which 
the particle or the system of particles is considered. The rest mass 
remains the same in all IFRs; it is an absolute value of a 4-vector. 
The time component of a 4-vector (energy) coincides with its
absolute value (rest mass) only when the spatial components of
this 4-vector are equal to zero (which means that either the
particle’s momentum or the total momentum of a system of particles
equals zero). And only when the energy coincides with the rest
energy is it proportional to the rest mass (with the constant
coefficient $c^2$).

Thus, we can ascribe the precise four-dimensional meaning to 
the momentum, energy and rest mass of a particle or a system, 
provided the first two quantities are treated as components of a 
4-vector of energy-momentum, and the last quantity as a norm of 
the same vector. The methodical aspects of the problem are dis- 
cussed in Supplement IV. 

Let us examine now a system composed of interacting particles. 
Eq. (5.63) remains valid, of course. Eq. (5.61) should be, how- 
ever, replaced by 


$$E = \sum_i \mathcal{E}_i + U, \tag{5.69}$$


where U denotes the interaction energy of particles. This energy 
is defined as the work required to disjoint the system into “initial” 
non-interacting parts. In a stable system U <0 since a “stable 
equilibrium” state is characterized by a minimum of energy. In 
such a system the quantity U is referred to as a binding energy. 
Although the explicit analytical expression for the interaction
energy is often rather difficult to write out (see § 5.8), its
magnitude can be estimated. From Eq. (5.60) we obtain for the
reference frame in which $\mathbf{P} = 0$:

$$M = \frac{\sum_i \mathcal{E}_i + U}{c^2}, \tag{5.70}$$

or expressed otherwise

$$M = \sum_i m_i + \frac{1}{c^2}\sum_i T_i + \frac{U}{c^2}, \tag{5.71}$$

where we made use of the relation $\mathcal{E}_i = m_ic^2 + T_i$ which is
correct for every individual particle.

If the condition $\sum_i T_i \ll |U|$ is met, i.e. the total relativistic
kinetic energy of particles is small, then

$$M = \sum_i m_i + \frac{U}{c^2}. \tag{5.72}$$


It is evident from Eq. (5.72) that in the system of interacting
particles the difference

$$\Delta M = \sum_i m_i - M, \tag{5.73}$$

customarily called a mass defect, is always different from zero.
In a stable system $U < 0$ and $\Delta M > 0$. From the mass defect
one can calculate the magnitude of the binding energy:

$$|U| = \Delta M \cdot c^2. \tag{5.74}$$


Such a calculation is meaningful only when binding energies
are substantial. Precisely such a case is realized in atomic nuclei.
It is well known that atomic nuclei are very stable, this being the
evidence of their considerable binding energy. Atomic nuclei are
composed of protons and neutrons, with each nucleus possessing
a quite definite number of protons and neutrons. The masses of
protons and neutrons in a free state (outside a nucleus) can be
experimentally determined. The mass of any atomic nucleus can
also be experimentally found. The difference between the total
mass of free protons and neutrons comprising the nucleus
and the measured mass of the nucleus yields the mass defect
and, according to Eq. (5.74), the binding energy. Exactly in this
manner the binding energies of nuclei are found in atomic physics.
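
As an illustration of this procedure, the sketch below estimates the binding energy of the deuteron (one proton plus one neutron) from the mass defect of Eq. (5.73). The rest energies are rounded reference values quoted in MeV and are an assumption of this example, as are the function and variable names; working with rest energies $mc^2$ means the factor $c^2$ is already included.

```python
# Rest energies in MeV, rounded to three decimals (assumed reference values).
PROTON_MEV = 938.272
NEUTRON_MEV = 939.565
DEUTERON_MEV = 1875.613


def mass_defect_energy(constituent_energies, bound_energy):
    """Mass defect times c^2, i.e. (sum m_i - M) c^2 of Eq. (5.73), in MeV."""
    return sum(constituent_energies) - bound_energy


binding = mass_defect_energy([PROTON_MEV, NEUTRON_MEV], DEUTERON_MEV)
print(f"deuteron binding energy ~ {binding:.3f} MeV")
```

With these inputs the mass defect corresponds to roughly 2.2 MeV, i.e. about a tenth of a percent of the deuteron rest energy.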

Let us write out separately expressions for ultra-relativistic
particles ($v \approx c$). In this case Eq. (5.51') holds:

$$\mathcal{E} = cp \tag{5.75}$$

and consequently from Eq. (5.51) we shall obtain $m = 0$. However,
for two or more particles we shall obtain from Eq. (5.63)

$$M^2c^2 = \Bigl(\sum_i p_i\Bigr)^2 - \Bigl(\sum_i \mathbf{p}_i\Bigr)^2 \neq 0, \tag{5.76}$$

since $\sum_i \mathcal{E}_i = c\sum_i p_i$. This points out that the rest mass of a
system composed of particles with the zero rest mass is not equal
to zero (unless all the momenta are parallel). There is nothing
surprising in this since rest masses do not add up.
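
For two massless particles Eq. (5.76) reduces to a closed form, which the following sketch evaluates. The function name and the parametrisation by the angle between the two momenta are assumptions made for illustration; the formula in the docstring follows from Eq. (5.76) with $|\mathbf{p}_i| = \mathcal{E}_i/c$.

```python
import math


def two_photon_rest_mass(e1, e2, angle, c=1.0):
    """Rest mass of a pair of massless particles.

    From Eq. (5.76) with |p_i| = E_i / c:
    M^2 c^4 = (E1 + E2)^2 - |p1 + p2|^2 c^2 = 2 E1 E2 (1 - cos(angle)),
    where angle is the angle between the two momenta.
    """
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(angle))) / c**2


print(two_photon_rest_mass(1.0, 1.0, math.pi))  # head-on pair: M c^2 = E1 + E2
print(two_photon_rest_mass(1.0, 1.0, 0.0))      # parallel pair: M = 0
```

The head-on pair has the largest rest mass, while a pair moving in the same direction has none, which is the limiting case excluded in Eq. (5.76).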

Finally, a few words about “composite” subsystems. When
determining a rest mass of a complex system according to Eqs.
(5.70) and (5.64), one should take a total energy of the system.
Assume that the system involves an electromagnetic field as well.
Designating the energy of the electromagnetic field by $W$, we
obtain from Eq. (5.70):

$$M = \sum_i m_i + \frac{1}{c^2}\sum_i T_i + \frac{U}{c^2} + \frac{W}{c^2}. \tag{5.77}$$



From this it is inferred that the energy of an electromagnetic 
field, as any other energy, makes its contribution to the rest mass 
of a system. 

§ 5.7. Some problems of relativistic mechanics of a particle. 
Within the framework of a given inertial frame of reference there 
is no need to resort to four-dimensional relations; it is sufficient 
to use the three-dimensional Eq. (5.26) and also Eq. (5.31). We 
shall recall that the general appearance of the second law of 
Newton remains invariable and only a momentum and energy of 
a particle are defined otherwise. This new definition, however,
changes substantially the properties of the solution of the problem
as compared to that obtained for the same problem from classical
mechanics equations. In particular, the solution of any problem
of relativistic mechanics does not permit of obtaining the velocity
of a particle exceeding that of light. Some other distinguishing
features come up as well; to elucidate them, we shall consider a
few problems, solving them with the aid of both the classical and
relativistic motion equations.

Since the equations obtained will not be needed afterwards, an
individual numeration for each problem is adopted in this section.

I. The elementary solution of the problem of the unidimensional
motion under the action of a constant force. The motion equation
takes the form (a classical equation on the left side and a
relativistic one on the right side):

$$\frac{d}{dt}(mv) = F; \quad \text{(a)} \qquad \frac{d}{dt}(m\gamma v) = F, \quad \text{(b)} \tag{1}$$

where $F = \mathrm{const}$. Integrating with the initial condition $v = 0$ at
$t = 0$, we obtain

$$mv = Ft; \quad \text{(a)} \qquad m\gamma v = Ft \quad \Bigl(\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}\Bigr). \quad \text{(b)} \tag{2}$$

A velocity as a function of time is found from Eqs. (2a, b)
algebraically:

$$v_{\mathrm{cl}} = \frac{Ft}{m}; \quad \text{(a)} \qquad v_{\mathrm{rel}} = \frac{Ft/m}{\sqrt{1 + (Ft/mc)^2}} = \frac{v_{\mathrm{cl}}}{\sqrt{1 + v_{\mathrm{cl}}^2/c^2}}. \quad \text{(b)} \tag{3}$$

We shall discuss these results in Problem II as it will become
evident that Problem II is just a variant formulation of Problem I.

II. The rectilinear and uniformly accelerated motion of a particle.
If a particle moves along the $x'$ axis ($v'_y = v'_z = 0$) in the
frame $K'$, then $v_y = v_z = 0$ in any other IFR according to Eq.
(3.26). Let us consider a motion of a particle along the common
$x$, $x'$ axis with a constant acceleration. If the acceleration of the
particle does not vary, the acceleration components in the reference
frame in which this particle is at rest (the proper reference
frame) are $(w_0, 0, 0, 0)$. The quantity $w_0$ is a customary
three-dimensional acceleration directed along the $x'$ axis. The square
of a 4-vector acceleration is an invariant, so that the following
condition must hold in all reference frames for a uniformly
accelerated motion *:

$$\Bigl(\frac{dV_i}{d\tau}\Bigr)^2 = w_0^2, \tag{1}$$

where $w_0$ is the magnitude of the three-dimensional acceleration
in the proper reference frame. Surely, this condition differs from
the requirement $\dot{v} = \mathrm{const}$ **. The 4-velocity components for a
unidimensional motion in any reference frame acquire the form
$V(\gamma v_x, 0, 0, ic\gamma)$, whence, according to Eq. (5.23), it follows
that $\mathcal{F}_2 = \mathcal{F}_3 = 0$. Consequently, the two equations that remain in
an arbitrary IFR are

$$m\frac{dV_1}{d\tau} = \gamma F_x, \qquad m\frac{dV_4}{d\tau} = \frac{i}{c}\,\gamma v_xF_x.$$

We denoted here $F_x = F$ and $v_x = v$. Eq. (1) will take the form

$$\Bigl(\frac{dV_i}{d\tau}\Bigr)^2 = \frac{1}{m^2}\Bigl(\gamma^2F^2 - \frac{\gamma^2}{c^2}\,v^2F^2\Bigr) = \frac{F^2}{m^2} = w_0^2 = \mathrm{const}.$$

The motion equation will be written as

$$\frac{d}{d\tau}(\gamma v) = \gamma w_0, \quad \text{or} \quad \frac{d}{dt}(\gamma v) = w_0. \tag{2}$$

In the three-dimensional notation

$$\frac{d}{dt}\,\frac{v}{\sqrt{1 - v^2/c^2}} = w_0, \quad \text{or} \quad \frac{v}{\sqrt{1 - v^2/c^2}} = w_0t + C. \tag{3}$$

* In this chapter we make use of the 4-space-time whose fourth coordinate
is imaginary.

** If $w_0 = \mathrm{const}$, $\dot{v}$ is variable. In classical mechanics an object acquires a
constant acceleration under the action of a constant force, whereas in
relativistic dynamics a constant force imparts a constant acceleration to an
object only in an instantaneous co-moving reference frame.

Solving Problem I, we got convinced, however, that these equations
could be obtained at once if we substituted the constant
force $F = mw_0$ into the relativistic motion equation. But $w_0$ is not
the same as $\dot{v}$; this can be inferred from Eq. (2).

If the initial condition is such that the velocity $v$ is equal to
zero at $t = 0$, then $C = 0$ and, expressing the velocity through $w_0$,
we obtain from Eq. (3)

$$v = \frac{dx}{dt} = \frac{v_{\mathrm{cl}}}{\sqrt{1 + (v_{\mathrm{cl}}/c)^2}} = \frac{w_0t}{\sqrt{1 + (w_0t/c)^2}}, \tag{4}$$

where $v_{\mathrm{cl}}$ denotes $w_0t$. Integrating the latter relation at the initial
conditions $x = 0$, $t = 0$, we obtain the following expression:

$$x = \frac{c^2}{w_0}\Bigl(\sqrt{1 + \frac{w_0^2t^2}{c^2}} - 1\Bigr). \tag{5}$$

The solution of the classical motion equation for a constant
force and the identical initial conditions takes the form

$$v_{\mathrm{cl}} = w_0t, \qquad x = w_0t^2/2.$$

While the classical velocity grows indefinitely with time, it follows
from Eq. (4), due to the obvious inequality

$$\frac{a}{\sqrt{A + a^2}} < 1, \quad \text{if } A > 0,$$

that the relativistic velocity always remains less than $c$ as it, in
fact, must be in accordance with the principle of ultimate velocity
of signal transmission. The relativistic equations (4) and (5) for
the velocity $v$ and coordinate $x$ turn into classical ones at
$v_{\mathrm{cl}}/c \ll 1$. If one rewrites Eq. (4) as $v = c/\sqrt{1 + c^2/w_0^2t^2}$, it
becomes evident that $v \to c$ when $t \to \infty$.
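
The contrast between the classical and relativistic solutions of Problems I and II can be made concrete with a short numerical sketch. The function names and the sample acceleration ($w_0 = 9.81$ m/s², SI units) are assumptions of this example; Eq. (5) is evaluated in an algebraically equivalent form, $x = w_0t^2/\bigl(1 + \sqrt{1 + (w_0t/c)^2}\bigr)$, chosen only to avoid rounding loss at small times.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def v_classical(w0, t):
    """Classical velocity under a constant force: v = w0 t."""
    return w0 * t


def v_relativistic(w0, t):
    """Eq. (4): v = w0 t / sqrt(1 + (w0 t / c)^2)."""
    v_cl = w0 * t
    return v_cl / math.sqrt(1.0 + (v_cl / C) ** 2)


def x_relativistic(w0, t):
    """Eq. (5), rewritten as w0 t^2 / (1 + sqrt(1 + (w0 t / c)^2))."""
    a = w0 * t / C
    return w0 * t**2 / (1.0 + math.sqrt(1.0 + a * a))


for t in (1.0, 1.0e7, 1.0e9):  # seconds
    print(t, v_classical(9.81, t) / C, v_relativistic(9.81, t) / C)
```

At small times both velocities agree; at large times the classical value exceeds $c$ while the relativistic one only approaches it.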

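Let us find the relationship between the coordinate time $t$ and
the proper time $\tau$ of a particle. If one chooses the common origin
of the time count $t_0 = \tau_0 = 0$, then

$$\tau = \int_0^t \sqrt{1 - \frac{v^2}{c^2}}\,dt = \int_0^t \frac{dt}{\sqrt{1 + w_0^2t^2/c^2}} = \frac{c}{w_0}\ln\Bigl(\frac{w_0t}{c} + \sqrt{1 + \frac{w_0^2t^2}{c^2}}\Bigr) \tag{6}$$

(see Eq. (3.16)).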




Ignoring unity in the radicand as compared to $(w_0t/c)^2$, we
obtain, when $t \to \infty$,

$$\tau \approx \frac{c}{w_0}\ln\frac{2w_0t}{c}.$$

We see that the proper time of an object moving with a uniform
acceleration flows substantially slower than the time in a
“motionless” reference frame relative to which the motion is
considered.
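
The closed form of Eq. (6) can be compared with a direct numerical integration of $d\tau = dt/\sqrt{1 + (w_0t/c)^2}$. The sketch below is illustrative only: the function names, the midpoint rule, the step count and the sample values (ten years of coordinate time at $w_0 = 9.81$ m/s²) are assumptions. It uses the identity $\ln(a + \sqrt{1 + a^2}) = \operatorname{arcsinh} a$.

```python
import math

C = 299_792_458.0               # speed of light, m/s
YEAR = 365.25 * 24 * 3600.0     # seconds in a Julian year


def proper_time(w0, t):
    """Eq. (6): tau = (c / w0) * asinh(w0 t / c)."""
    return (C / w0) * math.asinh(w0 * t / C)


def proper_time_asymptotic(w0, t):
    """Large-t limit: tau ~ (c / w0) * ln(2 w0 t / c)."""
    return (C / w0) * math.log(2.0 * w0 * t / C)


def proper_time_numeric(w0, t, steps=20_000):
    """Midpoint-rule integration of dtau = dt / sqrt(1 + (w0 t / c)^2)."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        tk = (k + 0.5) * h
        total += h / math.sqrt(1.0 + (w0 * tk / C) ** 2)
    return total


t = 10.0 * YEAR
print(proper_time(9.81, t) / YEAR, proper_time_numeric(9.81, t) / YEAR)
```

For these sample values the proper time comes out at roughly three years against ten years of coordinate time.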

There remains, of course, the physical question as to what clock
counts the particle’s proper time given by Eq. (6), since the
relation $d\tau = (1/\gamma)\,dt$ pertains to a clock moving uniformly and
rectilinearly. We examined this problem in detail in § 3.3.

In conclusion we should note that a relativistic uniformly
accelerated rectilinear motion is also called hyperbolic since the time
dependence of the path travelled (see Eq. (7) below) represents a
hyperbola in terms of geometry. If a charged particle moves in
a uniform and constant electric field, or a heavy particle in a
uniform and constant gravitational field, the motion is hyperbolic.
Let us write out finally the principal equations describing a
hyperbolic motion:

$$x = \frac{c^2}{w_0}\Bigl[\sqrt{1 + (w_0t/c)^2} - 1\Bigr], \qquad v = \frac{w_0t}{\sqrt{1 + (w_0t/c)^2}}, \qquad \frac{dv}{dt} = \frac{w_0}{\bigl[1 + (w_0t/c)^2\bigr]^{3/2}}. \tag{7}$$

From the expression for the time derivative of the velocity one
may see the difference between a relativistic and non-relativistic
“constant” acceleration.

III. The motion of a charged particle in a constant uniform
electric field. Let us choose the following initial conditions: at the
moment $t = 0$ the coordinates of a charged particle $x_0 = y_0 = 0$,
and its velocity $\mathbf{v}_0$ is perpendicular to the field $\mathbf{E}$. This
corresponds to the problem about a particle flying into a charged
capacitor parallel to its plates (Fig. 5.1). Let us direct the $x$ axis
along $\mathbf{E}$ and the $y$ axis along $\mathbf{v}_0$. Then the motion of the particle
will take place in the plane $(x, y)$. As far as it is possible, we
shall not discriminate between classical and relativistic equations
of motion, writing them in the common form:

$$\frac{d\mathbf{p}}{dt} = \mathbf{F}, \tag{1}$$

where $\mathbf{F} = e\mathbf{E}$ is the force exerted by the electric field on the
charged particle. In components:

$$\dot{p}_x = eE, \qquad \dot{p}_y = 0.$$

Whence the momentum is found by integration:

$$p_x = eEt + p_{0x}, \qquad p_y = p_{0y}.$$

But according to the initial conditions both in classical and
relativistic cases $p_{0x} = 0$ and $p_{0y} = p_0$ at $t = 0$. Consequently, one
may write

$$p_x = eEt, \qquad p_y = p_0.$$

Fig. 5.1. An electron flying into the uniform electric field of a capacitor is at
the origin of coordinates at the moment $t = 0$. The force exerted by the field is
directed along the $x$ axis. The initial velocity of an electron $v_0$ is directed along
the $y$ axis. The classical solution of the problem coincides with that of a problem
of the motion of a heavy point thrown horizontally in the gravitational field
at the velocity $v_0$.

From here on we should take into account the difference in
definitions of a momentum: classical, $\mathbf{p} = m\mathbf{v}$; relativistic,
$\mathbf{p} = m\gamma\mathbf{v}$.

Classical case. We have

$$v_x = \frac{dx}{dt} = \frac{eEt}{m}, \qquad v_y = \frac{dy}{dt} = v_0.$$

Integrating, we get

$$x = \frac{eEt^2}{2m} + x_0, \qquad y = v_0t + y_0.$$

But according to the initial conditions $x_0 = y_0 = 0$ at $t = 0$,
we finally obtain

$$x = \frac{eEt^2}{2m}, \qquad y = v_0t. \tag{*}$$

The velocity of a particle

$$v = \sqrt{v_x^2 + v_y^2} = \sqrt{\Bigl(\frac{eEt}{m}\Bigr)^2 + v_0^2}$$

increases indefinitely with time. The particle’s path obtained from
Eq. (*) by elimination of the time $t$ is represented by a parabola

$$x = \frac{eE}{2mv_0^2}\,y^2. \tag{2}$$

Relativistic case. In this case

$$p^2 = p_x^2 + p_y^2 = (eEt)^2 + p_0^2.$$

The particle’s energy $\mathcal{E}$ will be defined as

$$\mathcal{E} = c\sqrt{p^2 + m^2c^2} = \sqrt{m^2c^4 + p_0^2c^2 + (ceEt)^2} = \sqrt{\mathcal{E}_0^2 + (ceEt)^2}.$$

Using Eq. (5.55)

$$\mathbf{v}^{\mathrm{rel}} = \frac{c^2}{\mathcal{E}}\,\mathbf{p}, \tag{1'}$$

we obtain

$$v_x^{\mathrm{rel}} = \frac{dx}{dt} = \frac{c^2eEt}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}}, \tag{2'}$$

where $\mathbf{v}^{\mathrm{rel}}$ is defined according to Eq. (1'). It follows from
Eq. (2') that, as in the previous problem, $v_x^{\mathrm{rel}}$ is always
less than $c$, while

$$v_y^{\mathrm{rel}} = \frac{dy}{dt} = \frac{p_0c^2}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}} \tag{3'}$$

diminishes with time. Integrating Eq. (2'), we obtain with regard to
the initial conditions

$$x = \frac{1}{eE}\Bigl[\sqrt{\mathcal{E}_0^2 + (ceEt)^2} - \mathcal{E}_0\Bigr].$$

Integrating Eq. (3') and taking into account the initial
conditions, we get

$$y = \frac{p_0c}{eE}\,\operatorname{Arcsinh}\frac{ceEt}{\mathcal{E}_0}.$$

Eliminating $t$ from the expressions for $x$ and $y$, we have

$$x = \frac{\mathcal{E}_0}{eE}\Bigl(\cosh\frac{eEy}{p_0c} - 1\Bigr).$$

Thus, whereas the classical path was a parabola, the relativistic one
turned out to be a catenary curve. In the case of $v \ll c$ a catenary
curve, however, turns into a parabola. Indeed, if $v/c \ll 1$, we have
$\gamma \approx 1$, $p_0 = mv_0$, and $\mathcal{E}_0 = mc^2$. Besides, at small $\theta$ one may
assume $\cosh\theta \approx 1 + \theta^2/2!$, whence

$$x = \frac{mc^2}{eE}\cdot\frac{1}{2}\Bigl(\frac{eEy}{mv_0c}\Bigr)^2 = \frac{eE}{2mv_0^2}\,y^2,$$

which is the parabola of Eq. (2). This problem, as well as all
subsequent ones, shows that problems of relativistic mechanics
do not require any velocity dependence of mass: the solution is
obtained by integrating a motion equation.
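
The two trajectories of Problem III can be compared numerically. The sketch below is a hedged illustration: the function names, the parameter `force` (standing for the product $eE$), the units with $c = 1$ and $m = 1$, and the sample numbers are all assumptions. It checks that a point $(x(t), y(t))$ of the relativistic solution lies on the catenary $x = (\mathcal{E}_0/eE)\,[\cosh(eEy/p_0c) - 1]$, and that for a slow particle the catenary nearly coincides with the classical parabola.

```python
import math


def relativistic_position(t, force, p0, e0, c=1.0):
    """x(t), y(t) of the relativistic solution; force stands for e*E."""
    root = math.sqrt(e0**2 + (c * force * t) ** 2)
    x = (root - e0) / force
    y = (p0 * c / force) * math.asinh(c * force * t / e0)
    return x, y


def catenary(y, force, p0, e0, c=1.0):
    """Relativistic path: x = (E0 / eE) * (cosh(eE y / (p0 c)) - 1)."""
    return (e0 / force) * (math.cosh(force * y / (p0 * c)) - 1.0)


def parabola(y, force, m, v0):
    """Classical path, Eq. (2): x = eE y^2 / (2 m v0^2)."""
    return force * y**2 / (2.0 * m * v0**2)


# Assumed sample values in units with c = 1 and m = 1.
m, c, v0, force = 1.0, 1.0, 0.6, 2.0
gamma0 = 1.0 / math.sqrt(1.0 - (v0 / c) ** 2)
p0, e0 = m * gamma0 * v0, m * gamma0 * c**2

x, y = relativistic_position(3.0, force, p0, e0, c)
print(x, catenary(y, force, p0, e0, c))  # the point lies on the catenary
```

For an initial velocity of about $0.01c$ and small $y$ the two curves differ by a fraction of a percent in this sketch, in line with the limiting transition described above.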

IV. The motion of a charged particle in a constant uniform
magnetic field. The classical and relativistic motion equations for
a charged particle in a magnetic field

$$\frac{d\mathbf{p}}{dt} = e[\mathbf{v}\mathbf{B}] \tag{1}$$

are the same not only in their appearance. The point is that a
magnetic field does not perform any work on a charged particle
and the energy of the particle remains constant (see Eqs. (5.20)
and (5.31)); of course, the expressions for energy in classical and
relativistic cases are different. Using the relativistic relation

$$\mathbf{p} = \frac{\mathcal{E}}{c^2}\,\mathbf{v},$$

where $\mathcal{E} = \mathrm{const}$, Eq. (1) can be rewritten in the form

$$\frac{d\mathbf{v}}{dt} = \frac{ec^2}{\mathcal{E}}[\mathbf{v}\mathbf{B}], \tag{2}$$

whereas from the classical definition of a momentum $\mathbf{p} = m\mathbf{v}$ it
follows that

$$\frac{d\mathbf{v}}{dt} = \frac{e}{m}[\mathbf{v}\mathbf{B}]. \tag{3}$$

Therefore, relativistic and classical motion equations (2) and (3)
are distinguished only by the constants standing in front of the
vector product. We shall recall how Eq. (2) or (3) is solved.
Orient the $z$ axis along a magnetic field. Then $\mathbf{B} = B\mathbf{k}$. Denote
the constant factors appearing in front of the product $[\mathbf{v}\mathbf{k}]$ in
Eqs. (2) and (3) as follows:

$$\omega_{\mathrm{cl}} = \frac{eB}{m}, \qquad \omega_{\mathrm{rel}} = \frac{ec^2B}{\mathcal{E}} = \frac{eB}{m\gamma} = \omega_{\mathrm{cl}}\sqrt{1 - \beta^2}. \tag{4}$$

Now for definiteness let us solve Eq. (2). To define the vector
product $[\mathbf{v}\mathbf{k}]$, rewrite Eq. (2) in components:

$$\dot{v}_x = \omega v_y, \qquad \dot{v}_y = -\omega v_x, \qquad \dot{v}_z = 0. \tag{5}$$

It is expedient to resort to a complex variable in the plane
$(v_x, v_y)$. Multiplying the second relation of (5) by the number $i$
and adding the result to the first one, we obtain

$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y).$$

This equation can readily be integrated:

$$v_x + iv_y = ae^{-i\omega t},$$

where $a$ is a complex constant. If it is written in the form
$a = v_{0t}e^{-i\alpha}$ with the real $v_{0t}$ and $\alpha$, the solution takes the following
form:

$$v_x + iv_y = v_{0t}e^{-i(\omega t + \alpha)}. \tag{6}$$

Obviously, $v_{0t}$ is a modulus of the complex number in the left-hand
side of Eq. (6):

$$v_{0t}^2 = v_x^2 + v_y^2.$$

Consequently, the magnitude of the particle’s velocity remains
constant in the plane $(x, y)$. Eq. (6) can be rewritten in the form

$$\frac{d}{dt}(x + iy) = v_{0t}e^{-i(\omega t + \alpha)},$$

permitting of direct integration:

$$x + iy = \frac{v_{0t}}{\omega}\,e^{-i(\omega t + \alpha - \pi/2)}. \tag{7}$$

Recalling the geometric representation of the complex quantity
$w = x + iy = re^{i\varphi}$, we see that a particle remains permanently
on a circumference of a constant radius $r = v_{0t}/\omega$, while the angle
between its radius vector and the $x$ axis varies evenly with
time: $\varphi = -\omega t + \mathrm{const}$. This means that the projection of the
particle’s motion on the plane $(x, y)$ is a uniform motion along a
circumference of the radius

$$r = \frac{v_{0t}}{\omega} = \frac{m\gamma v_{0t}}{eB} = \frac{p_t}{eB}, \tag{8}$$

where $p_t$ is the projection of a momentum on the plane $(x, y)$
and $\omega$ is an angular velocity. As to the motion along the $z$ axis,
it follows from the third relation (5) that

$$z = z_0 + v_{0z}t. \tag{9}$$


From Eqs. (8) and (9) it follows that a charged particle in a uni- 
form magnetic field moves along a helical line whose axis coin- 
cides with the direction of a magnetic field, and whose radius 
is determined according to Eq. (8). The velocity of a particle is 
constant, as it should be in a magnetic field. If at the initial mo- 
ment the velocity of a particle in the direction of a magnetic field 
is equal to zero (voz = 0), the particle moves along the circum- 
ference in the plane perpendicular to the field. 

The quantity $\omega_{\mathrm{rel}}$ defines the cyclic frequency of rotation of the
particle’s projection on the plane $(x, y)$ perpendicular to the
direction of a magnetic field. This frequency is referred to as
cyclotronic. As we have seen, $\omega_{\mathrm{cl}} = \gamma\omega_{\mathrm{rel}}$, i.e. the cyclotronic frequency
of relativistic particles is less than that of non-relativistic ones.
At small velocities $\gamma \to 1$ and $\omega_{\mathrm{rel}} \to \omega_{\mathrm{cl}}$.

In conclusion let us consider an acceleration gained by a particle
in an electromagnetic field in terms of classical and relativistic
mechanics. From the general motion equation

$$\frac{d\mathbf{p}}{dt} = e\{\mathbf{E} + [\mathbf{v}\mathbf{B}]\}$$

we obtain in the classical case ($\mathbf{p} = m\mathbf{v}$)

$$\dot{\mathbf{v}}_{\mathrm{cl}} = \frac{e}{m}\{\mathbf{E} + [\mathbf{v}\mathbf{B}]\}.$$

In order to obtain an acceleration in the relativistic case we shall
make use of Eq. (5.55), whence

$$\frac{d\mathbf{p}}{dt} = \frac{\mathbf{v}}{c^2}\frac{d\mathcal{E}}{dt} + \frac{\mathcal{E}}{c^2}\frac{d\mathbf{v}}{dt}.$$

According to Eq. (5.31) $d\mathcal{E}/dt = \mathbf{F}\mathbf{v} = e\mathbf{E}\mathbf{v}$, and according to
Eq. (5.32) $\mathcal{E}/c^2 = m\gamma$, so that (cf. Eq. (5.38))

$$\dot{\mathbf{v}}_{\mathrm{rel}} = \frac{e}{m\gamma}\Bigl\{\mathbf{E} + [\mathbf{v}\mathbf{B}] - \frac{\mathbf{v}}{c^2}(\mathbf{E}\mathbf{v})\Bigr\} = \frac{1}{\gamma}\,\dot{\mathbf{v}}_{\mathrm{cl}} - \frac{e}{m\gamma c^2}\,\mathbf{v}(\mathbf{E}\mathbf{v}).$$

The second term in the last link of the equation can be regarded
as an emergence of a certain friction (proportional to the velocity);
owing to this, one may realize in qualitative terms that the
acceleration of a particle decreases sharply as the particle’s velocity
approaches that of light. It is obvious, of course, that
$\dot{\mathbf{v}}_{\mathrm{cl}} = \dot{\mathbf{v}}_{\mathrm{rel}}$ with an accuracy to within $\beta^2$. The motion of a charged
particle in constant electric and magnetic fields is presented in
detail in [9], § 22; we should only point out here that in the case
of crossed (mutually perpendicular) fields for which $E^2 - c^2B^2 \neq 0$
holds (cf. § 6.5), a transition to a certain inertial frame of
reference may eliminate one of the fields and leave either an
electric or a magnetic field. Then in this reference frame one may
utilize the results obtained here.
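
The cyclotron frequencies of Eq. (4) and the orbit radius of Eq. (8) in Problem IV are easy to evaluate. The following sketch assumes SI units, an electron as the particle, a purely transverse velocity ($v_{0z} = 0$) and illustrative function names; none of these choices comes from the text.

```python
import math

C = 299_792_458.0            # speed of light, m/s
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_ELECTRON = 9.1093837e-31   # electron mass, kg (rounded)


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def omega_classical(q, b, m):
    """omega_cl = e B / m from Eq. (4)."""
    return q * b / m


def omega_relativistic(q, b, m, v):
    """omega_rel = e B / (m gamma) = omega_cl * sqrt(1 - beta^2)."""
    return q * b / (m * lorentz_factor(v))


def orbit_radius(q, b, m, v_t):
    """Eq. (8), r = p_t / (e B), assuming the velocity is purely transverse."""
    return m * lorentz_factor(v_t) * v_t / (q * b)


v = 0.6 * C   # sample transverse speed
b = 1.0       # sample field, tesla
print(omega_classical(E_CHARGE, b, M_ELECTRON),
      omega_relativistic(E_CHARGE, b, M_ELECTRON, v),
      orbit_radius(E_CHARGE, b, M_ELECTRON, v))
```

At $v = 0.6c$ the relativistic frequency is lower than the classical one by the factor $\gamma = 1.25$, and the product of the radius and $\omega_{\mathrm{rel}}$ returns the transverse speed.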

V. The reaction motion in relativistic mechanics. As in the
previous problems, we shall be examining classical and relativistic
cases simultaneously. As an example, let us consider the
motion of a rocket which (together with the ejected gas) can be
treated as a closed system. We shall recall that the rocket
propulsion is brought about due to the fact that during each time
interval it ejects a certain amount of substance at a definite velocity
with respect to the rocket. In accordance with the momentum
conservation law the rocket shell with the left-over fuel acquires a
momentum in the direction opposite to the direction in which the
gas jet is ejected. Both in classical and in relativistic cases the
problem is easier to solve in the inertial frame of reference
co-moving with the rocket. Since the velocity of the rocket changes,
we deal with the instantaneous co-moving frame.


Classical case. Let at the moment $t$ a rocket mass (fuel and
container) be $M(t)$ and a velocity $V(t)$; during the time $dt$ the
rocket engine ejects a mass $dM$ of gas at a constant velocity $v$
relative to the rocket. Write out the momentum conservation law
in a co-moving frame (a unidimensional case):

$$(M - dM)\,dV - v\,dM = 0, \tag{1}$$

with $dM > 0$. Ignoring infinitesimal values of the second
order ($dM \cdot dV$), we get

$$\frac{dM}{M} = \frac{dV}{v}. \tag{2}$$

Eq. (2) is easily integrated ($v = \mathrm{const}$), bearing in mind that the
rocket mass decreases by the ejected mass $dM$:

$$V = v\ln\frac{C}{M},$$

where $C$ is an integration constant. Choosing the initial conditions:
at $t = 0$ $V = 0$, $M(0) = M_0$, we finally obtain

$$V = v\ln\frac{M_0}{M}. \tag{3}$$

Eq. (3) determines the velocity $V(t)$ of the rocket as a function
of the velocity of ejected gases and the change of the
rocket mass (the mass of the burnt fuel is equal to $M_0 - M$).

There is, however, one noteworthy feature in this derivation.
At any moment of time the rocket has different co-moving inertial
frames. The velocity increment in Eq. (2) during the time interval
$dt$ is written for different inertial frames of reference. In
transition from one inertial frame of reference to another all
velocities (and velocity increments) add up in classical mechanics.
Therefore, it is immaterial that Eq. (2) refers to different
IFRs; the final velocity can be obtained by summing (integrating)
velocity increments over the total time interval
during which the velocity of the rocket varies from 0 to $V$:

$$\int_{M_0}^{M}\frac{dM}{M} = -\frac{1}{v}\int_0^V dV, \tag{4}$$

where $dM$ under the integral sign denotes the (negative) change of
the rocket mass. This is just the method that was used in derivation
of Eq. (3). Hence it is clear how important in this derivation is
the additivity of velocities in transition from one IFR to
another.

In the foregoing calculations the mechanical energy conservation
law was not used because, on the one hand, it is
insufficient (the thermal energy is also significant here), and
on the other, it is not necessary when the velocity of a
rocket is calculated.

Relativistic case. In a relativistic case one should consider not
a velocity increment, but a velocity parameter increment, since
velocity parameters add up (are additive), while velocities do
not (see Eq. (3.37)).

Write out the relation between the velocity and the velocity
parameter $\theta$ (see Eqs. (2.27), (2.28)):

$$\tanh\theta = \beta, \qquad \cosh\theta = \gamma, \qquad \sinh\theta = \gamma\beta, \tag{5}$$

when examining the velocity of ejected gases, and

$$\tanh\Theta = B, \qquad \cosh\Theta = \Gamma, \qquad \sinh\Theta = \Gamma B, \tag{6}$$

when dealing with the velocity of the rocket. Obviously,

$$\mathcal{E} = mc^2\cosh\theta, \qquad p = mc\sinh\theta. \tag{7}$$

Suppose that during the time interval $dt$ gas of the rest mass $dm$ is
ejected at the velocity $\beta = v/c$. The velocity of the rocket after
the ejection increases by $dB = dV/c$. We shall express the increment
$dB$ via the velocity parameter increment (see Eq. (6)):

$$dB = \tanh(d\Theta). \tag{8}$$

Now write down the laws of momentum and energy conservation.
In a co-moving frame the momentum of a rocket “before the
ejection” is equal to zero ($B = 0$). After the ejection the rocket,
whose rest mass has become $M + dM$ (with $dM < 0$), acquires the
momentum $(M + dM)\,c\sinh(d\Theta)$, while the momentum of the ejected
mass $dm$ directed oppositely is equal to $dm\,c\sinh\theta$. Hence,
cancelling the common factor $c$,

$$-dm\sinh\theta + (M + dM)\sinh(d\Theta) = 0. \tag{9}$$

Relativistic mechanics makes it possible to take into account
any energy transformations, so that here we can write the
energy conservation law as well:

$$dm\,c^2\cosh\theta + (M + dM)\,c^2\cosh(d\Theta) = Mc^2. \tag{10}$$

But we may regard $d\Theta$ as a small quantity, so that
$\sinh(d\Theta) \approx d\Theta$, $\cosh(d\Theta) \approx 1$; consequently, ignoring
infinitesimal values of the second order $dM \cdot d\Theta$, we obtain from
Eq. (9)

$$dm\sinh\theta = M\,d\Theta, \tag{11}$$

and from Eq. (10)

$$dm\cosh\theta = -dM. \tag{12}$$

Dividing termwise Eq. (11) by Eq. (12), we get

$$d\Theta = -\frac{dM}{M}\tanh\theta,$$

or

$$d\Theta = -\beta\,\frac{dM}{M}, \tag{13}$$

where $M = M(t)$ is the mass of the rocket and fuel at the
moment $t$. Since in relativistic mechanics a velocity parameter
is an additive value, the final magnitude of a velocity
parameter can be found by integration:

$$\Theta = \beta\ln\frac{M_0}{M}, \tag{14}$$

where $M_0$ is the mass of the rocket at the moment when its
velocity is equal to zero.

Eq. (14) also determines implicitly the velocity of the
rocket at the moment when the mass of the burnt fuel is equal
to $M_0 - M$.


It is easy to see that if a rocket moves at a non-relativistic
velocity, Eq. (14) turns into Eq. (3). In fact, in this case $B \ll 1$
and, consequently, $\tanh\Theta = B \ll 1$, whence $\tanh\Theta \approx \Theta$ and Eq.
(14) coincides with Eq. (3). Just as in all solutions of relativistic
mechanics, the velocity of a rocket cannot exceed that of
light $c$. Even if we manage to burn all the mass of the rocket
together with the fuel, i.e. $M \to 0$ and $\ln(M_0/M) \to \infty$, from Eq. (14)
it only follows that $\Theta \to \infty$ (with the maximum value of $\beta$ being,
of course, equal to 1). But $B = \tanh\Theta$ and at $\Theta \to \infty$ $\tanh\Theta \to 1$,
i.e. $V \to c$.
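
The difference between Eq. (3) and Eq. (14) can be seen from a small numerical sketch. The function names and the sample exhaust velocities and mass ratios are assumptions of this example; the relativistic velocity is obtained from the velocity parameter as $V = c\tanh\Theta$ with $\Theta = \beta\ln(M_0/M)$.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def rocket_velocity_classical(v_exhaust, mass_ratio):
    """Eq. (3): V = v ln(M0 / M), with mass_ratio = M0 / M."""
    return v_exhaust * math.log(mass_ratio)


def rocket_velocity_relativistic(v_exhaust, mass_ratio):
    """Eq. (14): Theta = beta ln(M0 / M); the velocity is V = c tanh(Theta)."""
    beta = v_exhaust / C
    theta = beta * math.log(mass_ratio)
    return C * math.tanh(theta)


# Chemical-rocket regime (assumed numbers): both formulas agree closely.
print(rocket_velocity_classical(3000.0, 10.0),
      rocket_velocity_relativistic(3000.0, 10.0))

# Hypothetical exhaust at 0.5 c with M0 / M = 1000: only Eq. (14) stays below c.
print(rocket_velocity_classical(0.5 * C, 1000.0) / C,
      rocket_velocity_relativistic(0.5 * C, 1000.0) / C)
```

The second case shows the classical formula giving a velocity above $c$, while the relativistic one saturates just below it.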

Surely, the higher the ejection velocity, the more effective is
the rocket performance. Can the ejection velocity be made equal
to $c$, i.e. $\beta = 1$? It can be, provided light serves as a reaction gas:
only photons and neutrinos can move at the velocity $c$. These two
sorts of particles are peculiar in that their rest masses are equal
to zero (§ 7.6). The zero mass, however, is seen directly from
Eq. (10) in this case. As $v \to c$ a velocity parameter satisfies the
condition $\tanh\theta = \beta \to 1$. But with $\tanh\theta \to 1$, $\cosh\theta \to \infty$, and
in order to satisfy Eq. (10) at a finite value of $M$, it is necessary
that the rest mass of the ejected portion be equal to zero.

The more detailed analysis of photon rocket capabilities shows
its inapplicability for long-range space flights (see [11]).

VI. Colliding beams. The progress in nuclear physics depends 
essentially on how high are the energies of interaction between 
elementary particles that could be observed. There have been two 
sources of high-energy particles so far: cosmic rays and accele- 
rators. The accelerator designers are still very far from master- 
ing the energies which particles in cosmic rays possess, while 
the systematic research in high-energy physics has been restricted 
by the range of energies covered by accelerators. Accelerators are 
complex and expensive machines whose construction continues
for years and whose cost amounts to an appreciable portion of the
national budget of any highly developed country.

Suppose that in a laboratory reference frame particles are
accelerated to the energy $\mathcal{E}$. We are to succeed in colliding these
particles with other particles of the same type (for example, we
examine proton-proton collisions). The proton beam, possessing
the total energy $\mathcal{E}$ in a laboratory reference frame, can be directed
onto a target containing hydrogen in which protons are practically
motionless. It is sufficient to consider a collision of one bombarding
and one resting proton. Then the energy of the system consisting
of these two particles is equal to $\mathcal{E} + m$ (we assume
$c = 1$ in this section). The following question arises: is it possible
to increase essentially the interaction energy by making two
beams, each consisting of particles possessing the energy $\mathcal{E}$ (in
a laboratory frame), move toward each other? How much higher
will be the “useful” interaction energy? For the sake of diversity
we shall be treating this problem in rather unusual time units,
which are light metres (see Chapter 2).

To simplify the problem, we shall no longer speak of beams but consider
only two particles. The maximum useful energy (spent on the generation
of new particles, nuclear reactions, heating of a substance, etc.) can
be evaluated in the frame of the centre of inertia, for it is in this
frame that the internal energy of the system is calculated (naturally,
the motion of the system as a whole is “useless” from our point of
view). Let us consider two particles 1 and 2 possessing equal energies
(velocities) in a laboratory frame K and moving toward each other. For
them this frame is the frame of the centre of inertia, and the total
energy of the particles in this frame provides the useful energy under
discussion. This total energy is equal to 2T + 2m, where T is the
kinetic energy of each particle and 2m is the rest energy of the
particles (in the adopted time units c = 1). Let us find out how the
same collision would look in the frame K' in which particle 1 is at
rest. This is just the picture of a particle impinging on a target.
We shall consider the energy of particle 2 calculated in the frame
where particle 1 is at rest. Let us make the appropriate recalculation
according to Eq. (5.43). (Do not forget that the same collision is
being considered in another reference frame.) Designate the momentum
and energy of particles 1 and 2 in the frame K by $(p, \mathcal{E})$
and $(-p, \mathcal{E})$. In the frame K' the momentum and energy of
particle 1 are equal to $p'_1 = 0$, $\mathcal{E}'_1 = m$. Fig. 5.2
illustrates the frames K and K' and the velocities of particles in the
frame K. The frame K' is the proper frame of particle 1, and according to
Eq. (5.49)

$$\mathcal{E}_1 = \Gamma\mathcal{E}'_1 = \Gamma m = \gamma m \tag{1}$$

(in our case $\Gamma = \gamma$ since K' is associated with particle 1).
It follows immediately from Eq. (1) that

$$\gamma = \frac{\mathcal{E}}{m} = \frac{T + m}{m} \approx \frac{T}{m}.$$

We shall consider relativistic velocities of particles, when
$\beta \approx 1$; consequently

$$p = \beta\mathcal{E} \approx \mathcal{E}.$$


Fig. 5.2. Two points move in a laboratory frame at equal and oppositely
directed velocities. Particle 1 is at rest in the frame K'.

In our case $B = \beta \approx 1$, whence the quantity $\Gamma B$
entering in the transformation of energy is approximately equal to
$\Gamma$. Now it is easy to obtain the formula for the transformation
of the energy of particle 2 on passing from the reference frame K to K':

$$\mathcal{E}'_2 = \Gamma(\mathcal{E}_2 - Bp_2) \approx \Gamma(\mathcal{E} + \mathcal{E}) = 2\Gamma\mathcal{E}.$$

Here $2\mathcal{E}$ is the energy realized in a head-on collision,
accurate to within the doubled rest energy of the particles (which may
be neglected at relativistic velocities). To realize this energy with
particle 1 at rest, one needs an energy $\Gamma$ times higher. This
calculation shows the advantage that we gain when we use colliding
beams.
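The comparison above can be checked numerically. The following is a minimal sketch, assuming units with c = 1 and a rounded proton rest energy of 0.938 GeV; the function name `target_frame_energy` and the beam energy of 30 GeV are illustrative choices, not taken from the text. It evaluates the exact energy of particle 2 in the rest frame of particle 1 and compares it with the estimate $2\Gamma\mathcal{E}$.

```python
import math


def target_frame_energy(E, m):
    """Energy of particle 2 in the rest frame of particle 1 (units with c = 1).

    In the frame K both particles have total energy E and rest energy m and
    move toward each other, so K' moves with B = beta and Gamma = gamma.
    """
    gamma = E / m                            # Lorentz factor of each particle in K
    beta = math.sqrt(1.0 - 1.0 / gamma**2)   # speed of each particle in K
    p = beta * E                             # magnitude of the momentum in K
    # Particle 2 carries momentum -p, hence E' = Gamma * (E - B * (-p)).
    return gamma * (E + beta * p)


M_PROTON = 0.938   # GeV, rounded value (assumption for illustration)
E_BEAM = 30.0      # GeV per beam, hypothetical

exact = target_frame_energy(E_BEAM, M_PROTON)
estimate = 2.0 * (E_BEAM / M_PROTON) * E_BEAM   # 2 * Gamma * E

print("exact energy needed with a resting target:", exact)
print("relativistic estimate 2*Gamma*E:          ", estimate)
print("difference (equals the rest energy m):    ", estimate - exact)
```

The exact value is $m(2\gamma^2 - 1)$, so the estimate $2\Gamma\mathcal{E} = 2m\gamma^2$ overshoots it by exactly one rest energy, which is negligible at relativistic energies.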

The same result, however, can be obtained by simpler means. We shall
demonstrate that the recalculation of energy according to Eq. (5.43) is
equivalent to the calculation of energy according to the formula
$\mathcal{E} = m\gamma$ into which the relativistic expression for the
relative velocity of particles 1 and 2 is substituted.

Thus, let the momentum of particle 2 in the frame K be equal to
$p_2 = p_x = -m\gamma\beta$ and its energy to $\mathcal{E}_2 = m\gamma$.
We shall calculate the energy of particle 2 in the frame K' in which
particle 1 is at rest.

According to Eq. (5.43)

$$\mathcal{E}'_2 = \Gamma(\mathcal{E}_2 - Bp_x) = \Gamma(m\gamma + Bm\gamma\beta) = m\gamma^2(1 + B\beta),$$

since in our case $\gamma = \Gamma$ (in the frame K both particles have
the same speed). However, from Eq. (5.10) it follows (do not forget
that the velocity of particle 2 is negative, see Fig. 5.2) that

$$\Gamma\gamma(1 + B\beta) = \gamma';$$

therefore

$$\mathcal{E}'_2 = m\gamma',$$

where $\gamma'$ is determined for the velocity of particle 2 in the
frame K' (i.e. the velocity of particle 2 relative to particle 1).
Calculate the relative velocity of particles 1 and 2. We have

$$\beta'_x = \frac{\beta_x - B}{1 - \beta_x B}.$$

In the frame K particle 2 has the velocity $-\beta$ and particle 1 the
velocity $B = \beta$, and consequently

$$\beta' = -\frac{2\beta}{1 + \beta^2}.$$

This is just the relative velocity of particle 2. Find now $\gamma'$:

$$\gamma' = \frac{1}{\sqrt{1 - \beta'^2}} = \frac{1 + \beta^2}{1 - \beta^2} = \gamma^2(1 + \beta^2).$$

Consequently,

$$\mathcal{E}'_2 = m\gamma' = m\gamma\cdot\gamma(1 + \beta^2) \approx 2\Gamma\mathcal{E},$$

for $\beta \approx 1$ and $\gamma = \Gamma$. We have obtained the same
result again, as it should be.
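The equivalence of the two routes can be verified for a concrete speed. The sketch below assumes c = 1 and an arbitrary illustrative value $\beta = 0.6$; the helper names `gamma_of` and `transform_beta` are hypothetical. It computes the relative velocity from the velocity transformation formula and confirms that its Lorentz factor equals $\gamma^2(1 + \beta^2)$.

```python
import math


def gamma_of(beta):
    """Lorentz factor for a speed beta given in units of c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def transform_beta(beta_x, B):
    """Velocity along x seen from a frame moving with velocity B (c = 1)."""
    return (beta_x - B) / (1.0 - beta_x * B)


beta = 0.6                                 # speed of each particle in K (illustrative)
beta_rel = transform_beta(-beta, beta)     # particle 2 seen from particle 1
gamma_rel = gamma_of(beta_rel)

print("relative velocity:      ", beta_rel)
print("-2*beta/(1 + beta^2):   ", -2.0 * beta / (1.0 + beta**2))
print("gamma of relative speed:", gamma_rel)
print("gamma^2 * (1 + beta^2): ", gamma_of(beta) ** 2 * (1.0 + beta**2))
```

Both pairs of printed numbers should agree to rounding accuracy for any $0 \le \beta < 1$.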

§ 5.8. The conservation laws of relativistic mechanics. So far we have
discussed the energy and momentum conservation laws for a material
point, and now we have to dwell on the conservation laws for a system
of n material points. The problem of conservation laws has two aspects.
The first one involves deriving the relativistic conservation laws in a
given IFR. The second aspect pertains to the behaviour of the conserved
quantities on transition from one inertial frame to another. Both of
these problems are solved by obvious methods for a system of
non-interacting particles; in the case of interacting particles they
are very complicated.

Let us begin with a system of n non-interacting particles. The
equations of motion and of energy change for the kth particle take the
form (see Eqs. (5.27) and (5.31))

$$\frac{d}{dt}\left(m^{(k)}\gamma^{(k)}\mathbf{v}^{(k)}\right) = \mathbf{F}^{(k)}, \tag{5.78}$$

$$\frac{d}{dt}\left(m^{(k)}c^2\gamma^{(k)}\right) = \mathbf{F}^{(k)}\cdot\mathbf{v}^{(k)}, \tag{5.79}$$

where $\mathbf{F}^{(k)}$ denotes the force acting on the kth particle
(summation over k is not performed here!).

If a particle that does not interact with any other particles is
considered, $\mathbf{F}^{(k)} = 0$, and the momentum conservation law
$\mathbf{p}^{(k)} = m^{(k)}\gamma^{(k)}\mathbf{v}^{(k)} = \text{const}$
and the energy conservation law
$\mathcal{E}^{(k)} = m^{(k)}\gamma^{(k)}c^2 = \text{const}$ follow
directly from Eqs. (5.78) and (5.79). In fact, this circumstance
manifests itself in the following relation for an individual particle:
$\vec{P}^2 = p^2 - \mathcal{E}^2/c^2 = \text{const}$. In the case of an
individual particle which represents a closed system, $\vec{P}^2$
remains constant because $\mathbf{p}$ and $\mathcal{E}/c$ are each
constant individually. Note here once again that $\mathbf{p}$ and
$i\mathcal{E}/c$ of an individual particle combine to form a 4-vector.

When we deal with a system of n non-interacting material particles, the
total momentum $\sum_k \mathbf{p}^{(k)}$ and the total energy of the
system $\sum_k \mathcal{E}^{(k)}$ obviously remain constant because
each individual term does. The transformation laws for the total
momentum and total energy

$$\mathbf{P} = \sum_k \mathbf{p}^{(k)}, \qquad \mathcal{E} = \sum_k \mathcal{E}^{(k)} \tag{5.80}$$

in the case of transition from one IFR to another are evident: a sum of
vector components is transformed as a vector component.

The problem of conservation laws in a system of n interacting particles
is far more complicated. In conventional classical mechanics the
interaction of particles in the case of conservative forces could be
described by means of the system’s potential function
$U = U(\mathbf{r}^{(1)}, \mathbf{r}^{(2)}, \ldots, \mathbf{r}^{(n)})$,
where $\mathbf{r}^{(k)}(t)$ determines the position of the kth particle
at the moment t, the positions of all n particles being considered at
the same moment of time. In the final analysis a single moment of time
can be chosen in classical mechanics owing to the assumption of an
infinite velocity of interaction propagation.

Since the interaction propagation velocity is finite in relativistic
mechanics, the calculation of the force at a given point requires the
positions occupied by all particles at some preceding moment to be
known. Hence, it is clear that the form which the function U takes in a
relativistic case is rather complicated.

If one writes down the expression for the energy of a system of n
objects in the form

$$\mathcal{E} = \sum_k m^{(k)}c^2\gamma^{(k)} \tag{5.81}$$

and for the total momentum

$$\mathbf{P} = \sum_k m^{(k)}\gamma^{(k)}\mathbf{v}^{(k)}, \tag{5.82}$$

the following assertion is possible. The quantities $\mathbf{P}$ and
$i\mathcal{E}/c$ do not form a 4-vector, in contrast to what we had for
an individual particle. Besides, these quantities are not constant. The
equation $\mathcal{E} = \text{const}$ does not hold: even in classical
mechanics it is the system’s total energy, including its potential
energy, that remains constant, whereas Eq. (5.81) does not take the
potential energy into account, and there is no simple way of
incorporating it rigorously. The finite velocity of interaction
propagation causes the sum (5.82) to vary with time. In the final
analysis this circumstance clarifies the paradoxical fact that the
quantities $\mathcal{E}$ and $\mathbf{P}$, representing sums of
4-vector components, are not 4-vector components themselves. Indeed, in
any reference frame where the sums (5.81) and (5.82) are composed, the
summands are taken simultaneously in the sense of simultaneity of the
given reference frame. When passing over to another inertial frame, one
can find the values of the momenta and energies of individual particles
and add them according to the rules of the 4-vector transformation. In
the new frame, however, the recalculated events will not be
simultaneous. In order to find $\mathcal{E}$ and $\mathbf{P}$ in the
new frame, one has to reduce these sums to simultaneity in this new
reference frame. It is this simultaneity recalculation that deprives
the quantities $\mathbf{P}$ and $\mathcal{E}$ of the properties of
4-vector components.

Interacting relativistic systems feature ten integrals of motion: the
integrals of energy, of momentum, of motion of the centre of inertia
and of moment of momentum. The approximate form of these integrals is
presented in the book [16], § 27, for example.

As to the behaviour of the integrals of motion in transition from one
inertial frame to another, the energy and momentum form a 4-vector, and
the integrals of motion of the centre of inertia and of the moment of
momentum form an antisymmetric 4-tensor, in the approximation
($\beta^{(k)} = v^{(k)}/c \ll 1$) in which the terms
$(\beta^{(k)})^2$ are retained. Hence, it is clear that if all these
integrals remain constant in one reference frame, they will be constant
in any other reference frame.

There is a case in which the conservation laws for momentum and energy
can be put down in a simple form:

$$\sum_k m^{(k)}\gamma^{(k)}\mathbf{v}^{(k)} = \sum_k m'^{(k)}\gamma'^{(k)}\mathbf{v}'^{(k)}, \tag{5.83}$$

$$\sum_k m^{(k)}c^2\gamma^{(k)} = \sum_k m'^{(k)}c^2\gamma'^{(k)}. \tag{5.84}$$

These equations are valid only in the case of fast particles which
interact weakly (or briefly). Eqs. (5.83) and (5.84) are not valid
during an interaction, but they are quite suitable before and after it.
In particular, they can be applied to an ideal relativistic gas and
also to “collisions of microparticles”.

Here is an example showing how the conservation laws in a relativistic
form are applied in the study of particle “collisions”. Let a particle
of mass $m_0$ strike a resting particle of mass $m_1$. The total mass
of the particles generated as a result of the “collision” (“reaction”)
is equal to M. Reactions between particles are governed not only by the
momentum and energy conservation laws, but also by other specific laws
of conservation. We shall not take them into consideration here. We
shall be assuming, e.g. from experimental data, that a reaction may
take place. The momentum and energy conservation laws make it possible
to clear up the essential question: what is the minimum energy of a
striking particle sufficient to bring about the reaction we are
interested in?

“Before” and “after” the reaction the momentum and energy conservation
laws are complied with. In four-dimensional terms it means that the
energy-momentum 4-vector of a system of particles remains constant. We
consider the situation “before” the collision in a laboratory reference
frame. Before the collision the particles do not interact, and
therefore the energy of the system of particles is equal to
$\mathcal{E}_0 + m_1c^2$ and the momentum to $\mathbf{p}_0$, where
$\mathcal{E}_0$ is the total energy of the striking particle and
$\mathbf{p}_0$ is its momentum.

The situation after the collision is convenient to consider in the
frame of the centre of inertia. In accordance with the momentum
conservation law this reference frame moves uniformly and rectilinearly
relative to the laboratory one and therefore is also inertial (provided
the laboratory frame is inertial). The minimum energy required for the
accomplishment of the reaction is realized in the case when all
particles generated after the reaction are at rest in the frame of the
centre of inertia (otherwise their total energy would be higher).
Consequently, if the minimum energy is sought, the energy of the system
of particles generated after the collision is equal to $Mc^2$, and the
momentum is equal to zero (in the frame of the centre of inertia). In
transitions from one IFR to another the square of the energy-momentum
4-vector is an invariant.

Write out the energy-momentum 4-vectors for the system of particles
before and after the collision:
$\vec{P}^{0}\,(\mathbf{p}_0,\ \mathcal{E}_0/c + m_1c)$,
$\vec{P}\,(0,\ Mc)$. But the absolute values of these vectors are
equal, $(\vec{P}^{0})^2 = \vec{P}^2$, or
$M^2c^2 = (\mathcal{E}_0/c + m_1c)^2 - p_0^2$. Making use of the
relations $\mathcal{E}_0^2/c^2 - p_0^2 = m_0^2c^2$,
$\mathcal{E}_0 = T_0 + m_0c^2$, we get

$$M^2c^2 = m_0^2c^2 + m_1^2c^2 + 2m_1(T_0 + m_0c^2).$$

Hence, the minimum (“threshold”) value of the kinetic energy of a
striking particle follows directly:


$$T_0 = \frac{c^2}{2m_1}(M - m_0 - m_1)(M + m_0 + m_1).$$

This formula can be used in the treatment of various reactions. We
shall quote three examples.

The production of a $\pi$-meson in a collision of two nucleons:
$N + N \to N + N + \pi$. The photoproduction of a $\pi$-meson at a
nucleon: $N + \gamma \to N + \pi$. The production of a
proton-antiproton pair $(p + \bar{p})$ on bombardment of a
proton-containing (hydrogen) target with protons:
$p + p \to p + p + (p + \bar{p})$.
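The threshold formula is easy to evaluate for the three reactions just listed. The sketch below is illustrative only: it assumes rest energies in MeV (so that $c^2$ drops out), takes the nucleon to be a proton and the meson to be a neutral pion, and uses rounded rest energies (938.27 MeV and 134.98 MeV) that are not given in the text. The function name is hypothetical.

```python
def threshold_kinetic_energy(m0, m1, M):
    """Threshold kinetic energy T0 = (M - m0 - m1)(M + m0 + m1) / (2 m1).

    m0: rest energy of the striking particle, m1: rest energy of the target,
    M: total rest energy of the reaction products (all in the same units).
    """
    return (M - m0 - m1) * (M + m0 + m1) / (2.0 * m1)


M_PROTON = 938.27   # MeV, rounded (assumption)
M_PION0 = 134.98    # MeV, rounded (assumption)

# p + p -> p + p + pi0
t_pion = threshold_kinetic_energy(M_PROTON, M_PROTON, 2 * M_PROTON + M_PION0)
# gamma + p -> p + pi0 (for a photon m0 = 0 and T0 is its total energy)
t_photo = threshold_kinetic_energy(0.0, M_PROTON, M_PROTON + M_PION0)
# p + p -> p + p + (p + anti-p)
t_pbar = threshold_kinetic_energy(M_PROTON, M_PROTON, 4 * M_PROTON)

print("pion production threshold, MeV:      ", t_pion)
print("pion photoproduction threshold, MeV: ", t_photo)
print("antiproton production threshold, MeV:", t_pbar)
```

For the last reaction the formula reduces to six proton rest energies, well above the rest energy $2m_pc^2$ of the pair actually produced; the excess goes into the motion of the system as a whole.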

The interpretation of these reactions has to be based on the fact 
that the energy conservation law is always observed. In conse- 
quence, in these reactions the kinetic energy of initial particles 
turns (partially) into the rest energy of the particles produced. 
Speaking of the “production” of mass from kinetic energy is, of 
course, inaccurate. 


CHAPTER 6 


THE MAXWELL THEORY 
IN A RELATIVISTIC FORM 


The theory of relativity shows how to consider physical phenomena in
any inertial frame of reference. The STR is based on complete
equivalence of all inertial frames. Therefore, the basic equations
describing physical phenomena in nature must be identical in all
inertial frames; of course, in each reference frame they are written in
the requisite variables, i.e. using the length and time scales of the
given reference frame.

The basic system of equations describing electromagnetic phenomena was
provided by Maxwell. It is remarkable that the system of Maxwell’s
equations formulated fifty years prior to the advent of the special
theory of relativity proved to be covariant with respect to the Lorentz
transformation, i.e. it retains its appearance, up to the designations
of the variables, under the Lorentz transformation. This signifies that
the system of Maxwell’s equations retains its appearance in any
inertial frame of reference, and the principle of relativity holds
automatically.

Thus, the equations of electrodynamics require no modifications 
in terms of the STR, and it might seem that the theory of rela- 
tivity cannot introduce anything of importance. It is however far 
from being the case. 

First of all, prior to the advent of the theory of relativity it was
not clear in which reference frames the system of Maxwell’s equations
was valid. The theory of relativity indicated at once that this system
of equations holds in any inertial frame of reference. Hence, it was
natural to rewrite the system of Maxwell’s equations in a
four-dimensional form. Such a notation makes it possible to find the
transformation equations for the basic quantities of the theory when
passing from one IFR to another. Using a four-dimensional notation, we
shall also find the inseparable unity of charges and currents, electric
and magnetic moments, electric and magnetic fields. Relationships
between some other physical quantities will be discovered as well. Such
a close relationship between definite physical quantities remained
unnoticed until the emergence of a relativistic approach to
electromagnetic phenomena.

As to the transformation of electric and magnetic field components on
transition from one inertial frame to another, it can be carried out
systematically only in terms of the theory of relativity. It is the
theory of relativity that indicates that a four-dimensional
antisymmetric tensor is to be used to describe an electromagnetic
field.

§ 6.1. The three-dimensional system of Maxwell’s equations. 
A 4-potential and 4-current *. The Maxwell theory represents a 
macroscopic theory of electromagnetic field. In accordance with 
this theory an electromagnetic field in an arbitrary medium is 
described by the four vectors: the electric field strength E, mag- 
netic field strength H, electric field induction D and magnetic 
field induction B. 

In a uniform isotropic medium the number of field vectors need- 
ed to describe electromagnetic phenomena is reduced to two since 
field vectors turn out to be proportional to each other: 


$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{6.1}$$

The constant coefficients $\varepsilon$ and $\mu$ are called the
dielectric permittivity and the magnetic permeability respectively.
Vacuum is described as a uniform isotropic medium possessing definite
values of $\varepsilon$ and $\mu$ which are customarily denoted by
$\varepsilon_0$ and $\mu_0$ and referred to as the electric and the
magnetic constant respectively.

According to the Maxwell theory the field vectors obey the two 
principal equations: 


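$$\begin{aligned}
\operatorname{rot}\mathbf{H} &= \mathbf{j} + \dot{\mathbf{D}}, &\qquad \operatorname{rot}\mathbf{H} &= \mathbf{j} + \varepsilon\dot{\mathbf{E}}, \\
\operatorname{rot}\mathbf{E} &= -\dot{\mathbf{B}}; \quad \text{(a)} &\qquad \operatorname{rot}\mathbf{E} &= -\mu\dot{\mathbf{H}}; \quad \text{(b)}
\end{aligned} \tag{6.2}$$

the equations for an arbitrary medium are written on the left-hand
side; the same equations for a uniform and isotropic medium are seen on
the right-hand side.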

In the Maxwell theory the average values of the electric and the
magnetic field (relative to the “actual” microscopic fields) are
defined by the vectors E and B. In the general case the vectors D and H
are related to the average fields by the following equations:

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}, \qquad \mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M}), \tag{6.3}$$

where two more vectors are introduced: the polarization vector P and
the magnetization vector M.

* All equations of electrodynamics are written in this chapter by means of
the SI units. For the reader's convenience the principal formulae are also
written out in the Gaussian system of units in Appendix II.

The charge conservation law is assumed to be valid in the Maxwell
theory; in the case of a continuous charge distribution it is written
down as the continuity equation:

$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0. \tag{6.4}$$


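Here $\rho$ is the charge density and $\mathbf{j} = \rho\mathbf{v}$ the
current density. From Eqs. (6.2) and (6.4) two further equations
follow, which are readily combined with (6.2) and (6.4):

$$\begin{aligned}
\operatorname{div}\mathbf{D} &= \rho, &\qquad \operatorname{div}\mathbf{E} &= \rho/\varepsilon, \\
\operatorname{div}\mathbf{B} &= 0; \quad \text{(a)} &\qquad \operatorname{div}\mathbf{H} &= 0. \quad \text{(b)}
\end{aligned} \tag{6.5}$$

The force density with which an electromagnetic field acts on free
charges and currents is assumed to be equal to

$$\mathbf{f} = \rho\{\mathbf{E} + [\mathbf{v}\mathbf{B}]\}. \tag{6.6}$$

It is referred to as the Lorentz force. From this expression it is seen
once again that E and B are average macroscopic fields.

The system of Maxwell’s equations can be written down not only by means
of field vectors but also via the scalar and vector potentials
$\varphi$ and A. We shall consider the case of a uniform and isotropic
medium and relate the potentials $\varphi$ and A to the fields E and B
by means of the relations

$$\mathbf{E} = -\nabla\varphi - \dot{\mathbf{A}}, \qquad \mathbf{B} = \operatorname{rot}\mathbf{A}. \tag{6.7}$$

Having substituted these expressions into Eq. (6.2b) and having
additionally imposed the Lorentz condition

$$\operatorname{div}\mathbf{A} + \frac{1}{v^2}\dot{\varphi} = 0 \tag{6.8}$$

on the potentials (one can show that this condition is easily met), we
get the equations which hold for the potentials $\varphi$ and A:

$$\Box\varphi = -\frac{\rho}{\varepsilon}, \qquad \Box\mathbf{A} = -\mu\mathbf{j}, \tag{6.9}$$

where

$$\Box = \Delta - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}, \qquad v = \frac{1}{\sqrt{\varepsilon\mu}}. \tag{6.10}$$

In Eqs. (6.9) it is assumed that $\rho = \rho(\mathbf{r}, t)$,
$\mathbf{j} = \mathbf{j}(\mathbf{r}, t)$, i.e. the current and charge
densities are assigned functions of coordinates and time. Eqs. (6.7)
and (6.9) are equivalent to (6.2).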
Next, we shall impart a four-dimensional meaning to the quantities
involved in the Maxwell equations and rewrite the Maxwell equations
themselves in a four-dimensional form. However, we shall proceed
gradually, and the Maxwell equations (6.2) and (6.4) will be written in
a four-dimensional form only in § 6.7. Now we begin with composing
four-dimensional quantities from the potentials $\varphi$, A and the
densities $\rho$, $\rho\mathbf{v}$.

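Eqs. (6.9) are differential equations of the same kind, the d’Alembert
equations. Accordingly, they can be written at once as a single
four-dimensional equation, provided that two 4-vectors are introduced:
the 4-potential vector $\vec{\Phi}$ and the 4-current density vector
$\vec{s}$.

For some time we shall be simultaneously writing out definitions and
relations in the real and imaginary forms, just the way we did it in
mechanics. So, let us define the 4-potential vector $\vec{\Phi}$ as
follows:

$$\vec{\Phi}\begin{pmatrix}\Phi_1 & \Phi_2 & \Phi_3 & \Phi_4\\ A_x & A_y & A_z & i\varphi/c\end{pmatrix}\ \text{(a)}, \qquad
\vec{\Phi}\begin{pmatrix}\Phi^0 & \Phi^1 & \Phi^2 & \Phi^3\\ \varphi/c & A_x & A_y & A_z\end{pmatrix}\ \text{(b)} \tag{6.11}$$

and the 4-current vector

$$\vec{s}\begin{pmatrix}s_1 & s_2 & s_3 & s_4\\ j_x & j_y & j_z & ic\rho\end{pmatrix}\ \text{(a)}, \qquad
\vec{s}\begin{pmatrix}s^0 & s^1 & s^2 & s^3\\ c\rho & j_x & j_y & j_z\end{pmatrix}\ \text{(b)} \tag{6.12}$$

Recall the definition of the 4-radius vector:

$$\vec{R}\begin{pmatrix}x_1 & x_2 & x_3 & x_4\\ x & y & z & ict\end{pmatrix}\ \text{(a)}, \qquad
\vec{R}\begin{pmatrix}x^0 & x^1 & x^2 & x^3\\ ct & x & y & z\end{pmatrix}\ \text{(b)}$$

The components of the 4-vectors $\vec{\Phi}$, $\vec{s}$, $\vec{R}$ are
written in two parallel lines, in the symmetric and the conventional
notation respectively; the necessary component values are immediately
found from the comparison of these lines. Having determined the
4-potential and 4-current density, one can rewrite Eq. (6.9) for vacuum
(i.e. with $\varepsilon = \varepsilon_0$ and $\mu = \mu_0$, where
$c^2 = 1/(\varepsilon_0\mu_0)$) as a unified formula:

$$\Box\Phi_k = -\mu_0 s_k \quad (k = 1, 2, 3, 4)\ \text{(a)}, \qquad
\Box\Phi^k = -\mu_0 s^k \quad (k = 0, 1, 2, 3)\ \text{(b)} \tag{6.13}$$

Obviously, the three equations of (6.13a) at k = 1, 2, 3 coincide with
the three equations of (6.9) for A in the case of vacuum. At k = 4
Eq. (6.13a) gives $\Box(i\varphi/c) = -\mu_0 ic\rho$, and since
$c^2 = 1/(\varepsilon_0\mu_0)$, we obtain Eq. (6.9) for $\varphi$
again.

We suggest that the reader make sure that Eqs. (6.13b) also coincide
with Eqs. (6.9) in vacuum.

In the case of vacuum the Lorentz condition (Eq. (6.8)) and the charge
conservation law (Eq. (6.4)) can be written via the 4-divergence of the
vectors $\vec{\Phi}$ and $\vec{s}$. Indeed, for example,

$$\operatorname{div}\vec{\Phi} = \frac{\partial\Phi_i}{\partial x_i}
= \frac{\partial\Phi_1}{\partial x_1} + \frac{\partial\Phi_2}{\partial x_2} + \frac{\partial\Phi_3}{\partial x_3} + \frac{\partial\Phi_4}{\partial x_4}
= \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} + \frac{\partial(i\varphi/c)}{\partial(ict)}
= \operatorname{div}\mathbf{A} + \frac{1}{c^2}\frac{\partial\varphi}{\partial t},$$

$$\operatorname{div}\vec{s} = \frac{\partial s_i}{\partial x_i}
= \frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y} + \frac{\partial j_z}{\partial z} + \frac{\partial(ic\rho)}{\partial(ict)}
= \operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t}.$$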

Consequently, the Lorentz condition and the charge conservation law in
vacuum have the form $\operatorname{div}\vec{\Phi} = 0$,
$\operatorname{div}\vec{s} = 0$ when presented in a four-dimensional
notation. Surely, the same results are obtained on the basis of the
usual three-dimensional approach. The conclusions just drawn are very
significant. As is shown in Appendix I, § 4, the 4-divergence is
invariant under the Lorentz transformation. As to Eqs. (6.13), these
4-vector relations are valid in any inertial frame of reference in
vacuum. Thus, the equations for potentials, the Lorentz condition and
the charge conservation law can be rewritten so that it becomes evident
at once that they retain their appearance in any inertial frame of
reference. The covariant notation of the equations for potentials in
the case of a refractive medium ($\varepsilon \ne \varepsilon_0$ and
$\mu = \mu_0$) will be discussed below (see §§ 6.14, 6.15).

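§ 6.2. The transformation of a 4-potential and 4-current. The very fact
that we have managed to compose the 4-vectors $\vec{\Phi}$ and
$\vec{s}$ makes it possible to write down immediately the
transformation equations for the components of these vectors. We shall
write these equations here in both the real and the imaginary form
(cf. Eq. (4.10a, b)):

$$\Phi_1 = \Gamma(\Phi'_1 - iB\Phi'_4), \quad \Phi_2 = \Phi'_2, \quad \Phi_3 = \Phi'_3, \quad \Phi_4 = \Gamma(\Phi'_4 + iB\Phi'_1); \tag{6.14a}$$

$$\Phi^0 = \Gamma(\Phi'^0 + B\Phi'^1), \quad \Phi^1 = \Gamma(\Phi'^1 + B\Phi'^0), \quad \Phi^2 = \Phi'^2, \quad \Phi^3 = \Phi'^3; \tag{6.14b}$$

$$s_1 = \Gamma(s'_1 - iBs'_4), \quad s_2 = s'_2, \quad s_3 = s'_3, \quad s_4 = \Gamma(s'_4 + iBs'_1); \tag{6.15a}$$

$$s^0 = \Gamma(s'^0 + Bs'^1), \quad s^1 = \Gamma(s'^1 + Bs'^0), \quad s^2 = s'^2, \quad s^3 = s'^3. \tag{6.15b}$$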
Let us look more carefully into the current density transformation.
A 4-current comprises a current density and a charge density. It is
quite natural that a current and a charge density are combined into a
single 4-vector. Dealing with reference frames moving relative to one
another, one should bear in mind that a charge may be at rest only in
one (“proper”) reference frame. In all other IFRs the charge moves,
being in terms of these frames not only a charge but also a current. We
see, therefore, how easy is the transition from a motionless charge
(electrostatics) to a moving one (current): it is just a transition
from the charge’s proper reference frame to any other IFR. When a
current originates due to charges displaced together with a moving
medium or objects, it is called a convection current. It is just a
convection current that turns up on transition from a proper frame to
an arbitrary IFR.

In the formula $\mathbf{j} = \rho\mathbf{v}$, $\rho$ is the density of
those charges that move at the velocity $\mathbf{v}$; otherwise
misunderstandings may arise. For example, a current flows in metals
even though $\rho = 0$. Indeed, the total charge density in metals
comprises those of the ions and the free electrons and is equal to
zero: $\rho = \rho_+ + \rho_- = 0$. But certainly, a current can flow
provided there is a regular motion of electrons:
$\mathbf{j} = \rho_+\mathbf{v}_+ + \rho_-\mathbf{v}_- = \rho_-\mathbf{v}_-$,
because the velocity of regular motion of the ions is equal to zero.

From Eqs. (6.15) we obtain directly the convection current originating
on transition from the charge’s “proper” frame. Thus, let the frame K'
be characterized by a given charge density $\rho'$ and the absence of a
current ($\mathbf{j}' = 0$). Consequently, in the frame K' the
4-current density has the components
$\vec{s}'(0, 0, 0, ic\rho' = ic\rho_0)$, i.e.
$s'_1 = s'_2 = s'_3 = 0$, $s'_4 = ic\rho_0$. Then in accordance with
Eq. (6.15a), for example, in the frame K

$$s_1 = \Gamma(-iB\,ic\rho_0) = \Gamma V\rho_0, \qquad s_2 = s_3 = 0, \qquad s_4 = \Gamma ic\rho_0. \tag{6.16}$$

In expanded form the last relation of Eq. (6.16) reads

$$ic\rho = \frac{ic\rho_0}{\sqrt{1 - V^2/c^2}}.$$

This way we obtain the charge density transformation law on transition
from the charge’s “proper” frame, in which the charges are at rest, to
the frame relative to which the charges move at the velocity V:

$$\rho = \frac{\rho_0}{\sqrt{1 - V^2/c^2}}. \tag{6.17}$$

The first relation of (6.16) yields the current density

$$s_1 = j_x = \frac{\rho_0 V}{\sqrt{1 - V^2/c^2}} = \rho V. \tag{6.18}$$

As we have mentioned, a current associated with the motion of a charged
medium or charged object is referred to as a convection current.
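The passage from a charge at rest to a convection current can be reproduced numerically with the real form (6.15b) of the transformation. The following is a minimal sketch; the function name `boost_current`, the rest density of 1 µC/m³ and the frame velocity 0.6c are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def boost_current(rho_prime, j_prime, V):
    """Transform (rho', j') given in K' into the frame K.

    K' moves along the x axis of K with velocity V; the real-form
    components are s^0 = c*rho, s^1 = j_x, s^2 = j_y, s^3 = j_z.
    """
    B = V / C
    G = 1.0 / math.sqrt(1.0 - B * B)
    jx, jy, jz = j_prime
    s0_prime = C * rho_prime
    s0 = G * (s0_prime + B * jx)
    s1 = G * (jx + B * s0_prime)
    return s0 / C, (s1, jy, jz)


rho0 = 1.0e-6          # C/m^3, charge density in the proper frame (illustrative)
V = 0.6 * C            # velocity of the proper frame relative to K (illustrative)

rho, j = boost_current(rho0, (0.0, 0.0, 0.0), V)
print("rho / rho0 =", rho / rho0)          # should equal Gamma
print("j_x / (rho * V) =", j[0] / (rho * V))  # should equal 1
```

A charge distribution that is purely static in K' thus acquires both the larger density of Eq. (6.17) and the convection current of Eq. (6.18) in K.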

The meaning of Eq. (6.18) is very simple. The velocity of a charge
resting in the frame K' is equal to V with respect to the frame K (that
is the velocity of the frame K'; the same follows from the relativistic
velocity transformation formula (3.27)). Therefore, Eq. (6.18)
represents a convection current. As to the charge density
transformation (see Eq. (6.17)), it is associated with the volume
change (since a density is the charge per unit volume). Since a volume
transforms according to the law

$$d\mathcal{V} = d\mathcal{V}_0\sqrt{1 - V^2/c^2},$$

and we consider the same physical volume containing the same charge
$de$, then

$$\rho_0 = \frac{de}{d\mathcal{V}_0}, \qquad \rho = \frac{de}{d\mathcal{V}} = \frac{de}{d\mathcal{V}_0\sqrt{1 - V^2/c^2}} = \frac{\rho_0}{\sqrt{1 - V^2/c^2}} = \rho_0\Gamma.$$

Of course, the total charge in a given volume remains constant in any
reference frame:

$$\rho_0\,d\mathcal{V}_0 = \rho\,d\mathcal{V}. \tag{6.19}$$

Eq. (6.19) expresses the invariance of the charge confined in a given
volume. Using Eq. (6.17), the 4-vector $\vec{s}$ can be expressed
otherwise. Let us consider a small volume element of a moving medium in
the frame K. Then in the reference frame $K^0$ co-moving with this
element the velocity of the element is $v = 0$ and $\rho = \rho_0$. In
the frame K the density is $\rho = \rho_0\gamma$ and, consequently,
$\mathbf{j} = \rho\mathbf{v} = \rho_0\gamma\mathbf{v}$. Then
$\vec{s}(\rho_0\gamma\mathbf{v},\ \rho_0 ic\gamma) = \rho_0\vec{V}$,
where $\vec{V}$ is the 4-velocity of an element of the medium. Thus,

$$\vec{s} = \rho_0\vec{V}, \qquad s_i = \rho_0 V_i. \tag{6.20}$$

For $\vec{s}^2$ we have the following form:

$$\vec{s}^2 = \rho_0^2\gamma^2(v^2 - c^2) = -\rho_0^2c^2. \tag{6.21}$$

The 4-vector $\vec{s}$ is a time-like vector because a charge velocity
v is always less than c.
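Both invariants, the charge in a volume element (6.19) and the square of the 4-current (6.21), can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the helper names are hypothetical, the numerical values are arbitrary, and the square is computed in the real form, $(c\rho)^2 - j^2$, which equals $+\rho_0^2c^2$ and differs only in sign from the imaginary-form value $-\rho_0^2c^2$ quoted in Eq. (6.21).

```python
import math

C = 299_792_458.0  # speed of light, m/s


def four_current(rho0, v):
    """Real-form 4-current (c*rho, j_x, j_y, j_z) of a medium with proper
    charge density rho0 moving with 3-velocity v = (vx, vy, vz)."""
    speed2 = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - speed2 / C**2)
    return (C * rho0 * g,) + tuple(rho0 * g * vi for vi in v)


def real_square(s):
    """Square of a 4-vector in the real form: (s^0)^2 - |s_space|^2."""
    return s[0] ** 2 - sum(si * si for si in s[1:])


rho0 = 3.0e-6                          # C/m^3, proper density (illustrative)
v = (0.3 * C, 0.4 * C, 0.0)            # velocity of the element, |v| = 0.5c
s = four_current(rho0, v)

gamma = 1.0 / math.sqrt(1.0 - 0.25)
dV0 = 1.0e-3                           # proper volume of the element, m^3
dV = dV0 / gamma                       # contracted volume seen in K
rho = s[0] / C                         # density seen in K

print("s^2 / (rho0*c)^2 =", real_square(s) / (rho0 * C) ** 2)
print("charge in K:     ", rho * dV)
print("charge in K^0:   ", rho0 * dV0)
```

The first printed ratio should be 1 for any velocity below c, and the two charges should coincide.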

If in the frame K' there is an uncharged conductor through which a
current flows, i.e.

$$\vec{s}_0(j_{x0},\ j_{y0},\ j_{z0},\ \rho_0 = 0) \tag{6.22}$$

in the frame K', a certain charge density $\rho$ is observed in K.
Indeed, according to Eq. (6.15),

$$s_1 = \Gamma j_{x0}, \quad s_2 = j_{y0}, \quad s_3 = j_{z0}, \quad s_4 = ic\rho = \Gamma iBj_{x0}. \tag{6.23}$$

The first three formulae of (6.23) define the current density in the
frame K, the last one defines the charge density in the frame K:

$$\rho = \Gamma\frac{V}{c^2}j_{x0}. \tag{6.24}$$


Consequently, an observer in K will find the density $\rho$, even
though the charge density in the frame K' is equal to zero. This result
can easily be interpreted in geometric terms. Let the conductor be at
rest in the frame K; the ions of the conductor are motionless and the
electrons move at some average velocity v. In the frame K the world
lines of the ions are straight lines parallel to the $\tau$ axis while
the world lines of the electrons are straight lines forming a certain
angle $\theta = \arctan(v/c)$ with the $\tau$ axis. Fig. 6.1 shows the
reference frames $K(x, \tau)$ and $K'(x', \tau')$, the world lines of
ions (dotted lines), and the world lines of electrons (thin continuous
lines inclined to the $\tau$ axis at an angle $\theta$). Inasmuch as
the metal is neutral on the average, each segment of the conductor must
emit an equal number of world lines of ions and electrons. The charge
density should be measured simultaneously in each reference frame. In
the frame K it is determined by the number of world lines of ions and
electrons crossing a unit of length in this frame. For example, the
charge density is defined by the number of world lines of ions (taken
with the sign “+”) and the number of world lines of electrons (taken
with the sign “−”) going through the segment OA. A scale hyperbola cuts
the unit segments on the x and x' axes. It is to these unit segments
that the charges must be related. But in the frame K' the charge
density must be calculated for the whole conductor simultaneously. In
the frame K' simultaneous events are located on straight lines parallel
to the x' axis, and on the x' axis itself, in particular. It is seen,
however, from Fig. 6.1 that the unit segment OA' accommodates more
positive charges than negative ones. Accordingly, the conductor will
turn out to be charged positively in the frame K' although it is
neutral in the frame K. Surely, if one considers a closed conductor in
the frame K', its total charge will remain equal to zero, but an
electric dipole moment, not found in the frame K, will be observed in
the frame K' (see § 6.9 and Fig. 6.4).

Fig. 6.1. The Minkowski diagram illustrating the emergence of a charge
density in the frame K' in a current-carrying conductor which has zero
charge density in the frame K. The total charge density due to the ions
and electrons in the conductor is equal to zero. The current is
generated by the motion of the electrons, the ions being motionless.
The world lines of the ions are depicted by dotted lines, and the world
lines of the electrons by slanted continuous lines. Besides the
reference frame K (with the x, $\tau$ axes), also shown is the
reference frame K' (with the x', $\tau'$ axes); a scale hyperbola
cutting unit segments on the x and x' axes is also drawn in the figure.
Inasmuch as a charge density must be determined simultaneously at all
points, it has to be determined at all points of the object on the x or
x' axis respectively. It is seen that if a unit segment in the frame K
contains an equal number of ions and electrons, a unit segment in the
frame K' will contain more ions than electrons. This fact signifies the
emergence of a positive charge density in the frame K'.
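The geometric argument can be mirrored by a small calculation in which the ions and the electrons are treated as two separate charged media. The sketch below assumes the configuration of the figure: a conductor neutral and at rest in K, electrons drifting along +x, and a frame K' moving along +x. The function names and the (unrealistically large) speeds are illustrative assumptions chosen to keep the arithmetic well conditioned.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def gamma_of(u):
    """Lorentz factor for a velocity u along the x axis."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def density_seen_from(rho, u, V):
    """Density, in a frame moving with velocity V, of a medium that has
    charge density rho and velocity u (both measured in K, along x)."""
    rho_rest = rho / gamma_of(u)                  # proper density, Eq. (6.17)
    u_prime = (u - V) / (1.0 - u * V / C**2)      # velocity of the medium in K'
    return rho_rest * gamma_of(u_prime)


def wire_density_in_moving_frame(rho_ion, v_drift, V):
    """Net charge density in K' of a wire that is neutral in K."""
    rho_el = -rho_ion                             # neutrality in the rest frame K
    return (density_seen_from(rho_ion, 0.0, V)
            + density_seen_from(rho_el, v_drift, V))


rho_ion = 1.0            # arbitrary units
v_drift = 0.3 * C        # electron drift velocity in K (illustrative)
V = 0.5 * C              # velocity of K' relative to K (illustrative)

net = wire_density_in_moving_frame(rho_ion, v_drift, V)
# The same quantity from the 4-current of the wire in K: rho = 0, j_x = -rho_ion * v_drift
from_4_current = gamma_of(V) * (0.0 - V * (-rho_ion * v_drift) / C**2)

print("net density in K' (two media):", net)
print("net density in K' (4-current):", from_4_current)
```

Both routes give the same positive density, in agreement with the count of world lines crossing the segment OA'.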

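§ 6.3. An electromagnetic field tensor. In electrodynamics the electric
field strength E and magnetic induction B are conveniently expressed
via the vector and scalar potentials A and $\varphi$ as follows *:

$$\mathbf{B} = \operatorname{rot}\mathbf{A}, \qquad \mathbf{E} = -\operatorname{grad}\varphi - \partial\mathbf{A}/\partial t. \tag{6.25}$$

Let us rewrite these equations using 4-potential components; for the
present, we shall write out the relations of the complex 4-space:

$$B_x = B_1 = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = \frac{\partial\Phi_3}{\partial x_2} - \frac{\partial\Phi_2}{\partial x_3}, \tag{6.26}$$

$$E_x = E_1 = -\frac{\partial\varphi}{\partial x} - \frac{\partial A_x}{\partial t} = ic\frac{\partial\Phi_4}{\partial x_1} - ic\frac{\partial\Phi_1}{\partial x_4} = ic\left(\frac{\partial\Phi_4}{\partial x_1} - \frac{\partial\Phi_1}{\partial x_4}\right). \tag{6.27}$$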

* In previous chapters we utilized the letter B to denote the ratio of the
velocity of a coordinate system to that of light. Here we have to introduce
a magnetic induction vector B and its projections $B_x$, $B_y$, $B_z$. To
avoid confusion, we shall not be using the designation B = V/c in this
chapter, apart from the cases when misunderstandings are ruled out.

The last terms of Eqs. (6.26) and (6.27) are written on the basis of
the definition of the 4-potential components. Similarly, using the
components of $\vec{\Phi}$, one can write down the remaining components
of the vectors E and B as well. We shall obtain equations similar to
Eqs. (6.26) and (6.27), from which it follows that all components of
the vectors E and B can be expressed via certain combinations of
derivatives of the components of the 4-vector $\vec{\Phi}$ with respect
to the four-dimensional coordinates. These combinations form the
antisymmetric 4-tensor of the second rank *:

$$F_{ik} = \frac{\partial\Phi_k}{\partial x_i} - \frac{\partial\Phi_i}{\partial x_k} \qquad (i, k = 1, 2, 3, 4). \tag{6.28}$$


Prior to discussing mathematical features and the meaning of
Eq. (6.28) we should examine the same transition in the real 4-space.
As we have indicated, one has to differentiate between covariant and
contravariant components of vectors and tensors in this case. The
electromagnetic field tensor (Eq. (6.28)) is expediently written in
covariant components. Then, if the vector Φ has the contravariant
components (Φ⁰, A), its covariant components will be (Φ⁰, −A). The
differentiation with respect to contravariant coordinates leads to
covariant components again (see Appendix I, § 8). Thus, Eq. (6.26)
only changes the sign while Eq. (6.27) is written as

$$E_x = E_1 = -c\,\frac{\partial\Phi_0}{\partial x^1} + c\,\frac{\partial\Phi_1}{\partial x^0} = c\left(\frac{\partial\Phi_1}{\partial x^0} - \frac{\partial\Phi_0}{\partial x^1}\right) \tag{6.27'}$$

taking into account that $\varphi = c\Phi_0$, $A_x = -\Phi_1$,
$t = x^0/c$, $x = x^1$. Thus, in the case of the real 4-space we
introduce the covariant antisymmetric 4-tensor of the second rank

$$F_{ik} = c\left(\frac{\partial\Phi_k}{\partial x^i} - \frac{\partial\Phi_i}{\partial x^k}\right), \tag{6.28'}$$


coinciding in its appearance with Eq. (6.28). The introduction of
Eq. (6.28) or (6.28′) is of remarkable significance. The two Maxwellian
field vectors E and B can be expressed in the 4-space uniquely through
a certain combination of space-time derivatives of the 4-vector
potential Φ. The behaviour of the quantities $F_{ik}$ on transition
from one IFR to another is most significant for us. Their
transformation, however, is very easy to find: the quantities $F_{ik}$
form a tensor since one can readily make sure (see Appendix I, § 3)
that derivatives of 4-vector components with respect to coordinates
transform in accordance with the tensor transformation rule. If the
indices i and k in Eqs. (6.28) and (6.28′) acquire independently all
values from 1 to 4 (and from 0 to 3 respectively), we obtain 16 values
of $F_{ik}$ (four of which are equal to zero) expressed through
components of E and B. Let us write down these components in the form
of matrices:

$$F_{ik} = \begin{pmatrix} 0 & cB_z & -cB_y & -iE_x \\ -cB_z & 0 & cB_x & -iE_y \\ cB_y & -cB_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix}, \tag{6.29a}$$

$$F_{ik} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -cB_z & cB_y \\ -E_y & cB_z & 0 & -cB_x \\ -E_z & -cB_y & cB_x & 0 \end{pmatrix}. \tag{6.29b}$$

* Just as in most books on relativistic electrodynamics, we use the
subindices i and k despite the fact that the imaginary unity, also
designated by i, keeps recurring alongside over and over again. We hope
that this will not lead to a misunderstanding.


We see that the components of an electric field strength and magnetic
induction are the components of one 4-tensor of an electromagnetic
field. As usual, in the designation $F_{ik}$ the first subindex i
denotes the row and the second subindex k the column of the matrix
$F_{ik}$.

Frequently the tensor (6.29a) is abbreviated as 𝔉 = (cB, −iE) *
and the tensor (6.29b) as 𝔉 = (E, cB), assuming the components
of the vectors E and B are arranged as in (6.29a) and (6.29b)
respectively.
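The way the six field components fill the matrix (6.29a) can be illustrated by a minimal Python sketch. The helper names `field_tensor`, `is_antisymmetric` and `independent_components` are hypothetical, introduced only for this illustration, and the field values are arbitrary.

```python
# Illustrative sketch: build the complex 4-space tensor (6.29a), F = (cB, -iE).
# Function names and sample numbers are assumptions made for this example only.

C = 299_792_458.0  # speed of light, m/s


def field_tensor(E, B, c=C):
    """Return matrix (6.29a) as nested lists; list index 0..3 stands for 1..4."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0,      c * Bz,  -c * By, -1j * Ex],
        [-c * Bz,  0.0,      c * Bx, -1j * Ey],
        [c * By,  -c * Bx,   0.0,    -1j * Ez],
        [1j * Ex,  1j * Ey,  1j * Ez, 0.0],
    ]


def is_antisymmetric(F):
    """Check F_ik = -F_ki for every pair of indices."""
    return all(F[i][k] == -F[k][i] for i in range(4) for k in range(4))


def independent_components(F):
    """Components above the diagonal: the independent ones."""
    return [F[i][k] for i in range(4) for k in range(i + 1, 4)]


F = field_tensor(E=(100.0, 200.0, -300.0), B=(2e-6, -1e-6, 3e-6))
print(is_antisymmetric(F), len(independent_components(F)))
```

The four diagonal entries vanish and the entries below the diagonal repeat those above it with the opposite sign, so only six numbers carry information: three components of E and three of B.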

We have obtained a result which differs from the usual
three-dimensional case. The Maxwell theory customarily deals with
vector fields. Indeed, E and B behave as 3-vectors as far as the
transformation of a coordinate system, that is the rotation of
coordinate axes, is concerned. As soon as we pass over to reference
frames moving relative to one another, the situation changes
drastically. In 4-space E and B are no longer vectors, not even
four-dimensional ones. Although the vectors E and B are expressed via
components of a four-dimensional potential, there are no quantities
which could be added to the three-dimensional vectors E and B in order
to make them 4-vectors. In 4-space an electromagnetic field is a single
quantity of a more complicated mathematical nature than a 4-vector. The
fields E and B have merged into a single 4-tensor which is referred to
as an electromagnetic field tensor.

The emergence of a single 4-tensor instead of two three-dimensional
vectors describing an electromagnetic field has a clear-cut physical
meaning. Electric and magnetic fields are intertwined so inseparably
that the “appearance” or “disappearance” of one of the fields is
determined by the choice of a reference frame. For example, a “pure”
electric field generated by a charge occurs under very specific
circumstances when the charge is considered in the frame in which it is
at rest. In any other inertial frame, however, this charge moves and
consequently generates an electric current producing a magnetic field.
On the other hand, we have seen that although in a certain reference
frame a current-carrying conductor appears neutral, in other inertial
frames of reference it appears charged, and consequently, an electric
field is bound to occur in these frames.

* The Gothic letter 𝔉 denotes 4-force components in Chapter 5. In this
chapter the same letter is used exclusively for the designation of a
tensor.

Thus, it is sufficient, for example, to have only an electric field
in the frame K for a magnetic field to appear in any other frame K′.
If in the frame K there is only a magnetic field, both magnetic and
electric fields will appear in any other frame K′. Had we attempted to
treat the fields E and B as vectors, that physical fact could not have
been expressed in mathematical terms. As we have mentioned, there are
no quantities to make the three-dimensional vectors of an
electromagnetic field become 4-vectors. Moreover, had each of the
vectors E and B been contained in “its respective” 4-vector, the
Lorentz transformation would have necessitated each of these vectors in
a “new” frame to be expressed through components of “its respective”
vector in an “old” frame. In such a way, the vectors E and B would turn
out to be unrelated. Practice, however, indicates an intimate
connection between electric and magnetic fields, that is between the
vectors E and B.

Two three-dimensional vectors possess six independent components. An
antisymmetric 4-tensor of the second rank possesses exactly six
independent components. We have found (see Eqs. (6.29a, b)) that the
fields E and B form an antisymmetric 4-tensor, an electromagnetic field
tensor. Inasmuch as any component of a tensor in a new reference frame
is a linear combination of all components of that tensor in an old
reference frame, a transition from one reference frame to another may
result in the appearance of an electric field due to a magnetic field
observed in another frame, and vice versa. In a certain sense, an
electromagnetic field is a closed formation: if some inertial frame has
neither an electric nor a magnetic field, an electromagnetic field will
not appear in any other inertial frame. We shall deal with the
transformation of electromagnetic field components in the next section
while here we shall examine briefly an electromagnetic field in matter.

To describe a field in matter, one has to introduce, in addition to
the average fields E and B, two more vectors. These can be either the
electric induction vector D and the magnetic field strength H, or the
electric polarization vector P and the magnetization vector M. These
four vectors are interrelated as follows:

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}, \qquad \mathbf{H} = \mathbf{B}/\mu_0 - \mathbf{M}. \tag{6.30}$$

The vectors H and D form a special tensor whose components are
customarily denoted by $f_{ik}$ while the tensor itself is abbreviated
as f = (H, −icD). This tensor is derived from (6.29a) by substituting
H components for cB components and −icD components for −iE ones. Here
is the matrix of the components of this tensor:

$$f_{ik} = \begin{pmatrix} 0 & H_z & -H_y & -icD_x \\ -H_z & 0 & H_x & -icD_y \\ H_y & -H_x & 0 & -icD_z \\ icD_x & icD_y & icD_z & 0 \end{pmatrix}. \tag{6.31}$$


In addition to the tensor f, it is also useful to introduce a tensor
of electric and magnetic moments of matter whose definition follows
easily from Eq. (6.30):

$$m_{ik} = \frac{1}{\mu_0 c}F_{ik} - f_{ik}. \tag{6.32}$$

It is abbreviated as follows: 𝔐 = (M, icP). Written in full it is

$$m_{ik} = \begin{pmatrix} 0 & M_z & -M_y & icP_x \\ -M_z & 0 & M_x & icP_y \\ M_y & -M_x & 0 & icP_z \\ -icP_x & -icP_y & -icP_z & 0 \end{pmatrix}. \tag{6.33}$$


It is the very existence of the tensors (6.29), (6.31) and (6.33) that
points to a close pairwise connection between the quantities E, B;
H, D and M, P. We have written out here the tensors $f_{ik}$ and
$m_{ik}$ only for a 4-complex space; the corresponding expressions for
a real 4-space can be easily derived by the reader.

§ 6.4. The transformation of electric and magnetic field components.
The four-dimensional approach is particularly convenient because as
soon as the mathematical nature of one or another physical quantity is
established (a scalar, 4-vector, 4-tensor), the problem of its
transformation on transition from one IFR to another is solved
automatically. In mechanics we dealt with 4-vectors. As we have
established, components of the fields E and B, H and D, and M and P are
components of tensors (6.29a, b), (6.31) and (6.33) respectively.

Consequently, components of three-dimensional vectors are transformed
according to the rule of tensor component transformation. For example,
the components $F_{ik}$ in a 4-complex space are transformed as
follows:

$$F_{ik} = \alpha_{im}\alpha_{kl}F'_{ml}, \tag{6.34}$$

where $\alpha_{im}$ are components of the Lorentz transformation
matrix (2.41a) while components $F_{ik}$ are defined by Eq. (6.29a).
In order to transform components of matrix (6.29b), one has to make
use of the Lorentz transformation matrix in the form (2.41b).

Here we ought to make a purely methodological remark. Lec- 
turers often avoid using tensors trying not to complicate a lecture 
course. Indeed, explaining the meaning of tensors and their prop- 
erties in the course of half an hour is a difficult task. One can- 
not, however, disregard the fact that an electromagnetic field is a 
tensor. The old question arises again: “shall we call a cat a cat” 
from the very beginning? Of course, it is not so much the matter 
of a name, as the transformation equation (6.34). Apparently this 
equation can and should be obtained in the simplest way possible. 
For example, it is easily derived as follows: we see from Eq. (6.28)
that the quantities $F_{ik}$ are linear combinations of 4-vector
component derivatives with respect to 4-coordinates; the transformation
of vector component derivatives due to the coordinate transformation
follows from the simple analysis (see Appendix I, § 3).

To memorize the transformation rules for tensor components, 
one should keep in mind that they transform as a product of cor- 
responding vector components. One way or another, we get Eq. 
(6.34). And here it is the right time to call a tensor a tensor, hav- 
ing disclosed, rather one-sidedly, of course, the meaning of tensor 
components by means of vector component derivatives with re- 
spect to coordinates. 

We shall give an example of how the field transformation equation can
be obtained for the case $B_z = F_{12}/c$. According to Eq. (6.34) the
transformation equation for $F_{12}$ has the following form:

$$F_{12} = \alpha_{1m}\alpha_{2l}F'_{ml}. \tag{6.35}$$

Recall that the summation is carried out here over the two independent
indices m and l, each of which runs from 1 to 4. This way, Eq. (6.35)
involves the sum of sixteen terms, each being a product of two
$\alpha_{ik}$ and one of the $F'_{ml}$ components. We urge the readers
who come across such equations for the first time to write out (once
in a lifetime) all sixteen terms. Here is the easiest way to do this.
First, we develop the sum with m taking the values 1, 2, 3, 4. As
before, the index l denotes summation. This way we obtain a sum
consisting of four terms in which the index m is eliminated. Then we
perform the summation over l in each of these four terms. As a result
all sixteen terms will be written out. Then one should substitute
$\alpha_{ik}$ from the Lorentz matrix (see Eq. (2.41a)) and components
$F'_{ml}$ from Eq. (6.29a) into these terms. One can see at once that
most terms of the sum (6.35) are equal to zero. Because of this the
summation in Eq. (6.35) can be made much simpler. Indeed, the
quantities $\alpha_{1m}$, with m running from 1 to 4, constitute the
elements of the first line of the Lorentz matrix (see Eq. (2.41a))
while the second line is made up of the quantities $\alpha_{2l}$ with
l = 1, 2, 3, 4. But the first line of the matrix has only two elements,
$\alpha_{11}$ and $\alpha_{14}$, which differ from zero. Consequently,
one must consider only the values of m equal to 1 and 4. The second
line has only one element differing from zero, $\alpha_{22} = 1$.
Consequently, one has to take only l = 2 and to rewrite Eq. (6.35) as

$$F_{12} = cB_z = \alpha_{1m}\alpha_{2l}F'_{ml} = \alpha_{1m}F'_{m2} = \alpha_{11}F'_{12} + \alpha_{14}F'_{42} = \Gamma\left\{cB'_z - i\frac{V}{c}\,(iE'_y)\right\} = \frac{cB'_z + \dfrac{V}{c}E'_y}{\sqrt{1 - V^2/c^2}}.$$

Comparing the second and the last expressions in this chain of
equations and dividing them by c, we get

$$B_z = \frac{B'_z + \dfrac{V}{c^2}E'_y}{\sqrt{1 - V^2/c^2}} = \Gamma\left(B'_z + \frac{V}{c^2}E'_y\right).$$
In much the same way we obtain the transformation equations for the
other components:

$$\begin{aligned} E_x &= E'_x, & E_y &= \Gamma\,(E'_y + VB'_z), & E_z &= \Gamma\,(E'_z - VB'_y); \\ B_x &= B'_x, & B_y &= \Gamma\left(B'_y - \frac{V}{c^2}E'_z\right), & B_z &= \Gamma\left(B'_z + \frac{V}{c^2}E'_y\right). \end{aligned} \tag{6.36}$$

Let us write out the transformation equations for D and H to be used
later on:

$$\begin{aligned} D_x &= D'_x, & D_y &= \Gamma\left(D'_y + \frac{V}{c^2}H'_z\right), & D_z &= \Gamma\left(D'_z - \frac{V}{c^2}H'_y\right); \\ H_x &= H'_x, & H_y &= \Gamma\,(H'_y - VD'_z), & H_z &= \Gamma\,(H'_z + VD'_y). \end{aligned} \tag{6.37}$$

Exactly the same Eqs. (6.36) and (6.37) are of course obtained in the
real 4-space. We shall not mention this space any more because
hereinafter we shall only use the final equations and they are
identical; moreover, a substantial difference is noticed only in
transition from Eq. (6.27) to (6.27′). What follows is a simple
matter.
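The sixteen-term summation can also be carried out mechanically. The sketch below applies the rule (6.34) to the matrix (6.29a) and compares the result with the component formulas (6.36). The explicit matrix in `lorentz_matrix` is an assumed form of (2.41a), chosen so that $x_i = \alpha_{ik}x'_k$ with $x_4 = ict$ and K′ moving along the x axis at the velocity V; all function names and numbers are hypothetical.

```python
# Illustrative sketch: tensor rule (6.34) versus component formulas (6.36).
# The matrix in lorentz_matrix() is an ASSUMED form of (2.41a) for x_4 = ict.
import math

C = 299_792_458.0  # speed of light, m/s


def field_tensor(E, B, c=C):
    """Matrix (6.29a); list index 0..3 stands for tensor index 1..4."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0,      c * Bz,  -c * By, -1j * Ex],
        [-c * Bz,  0.0,      c * Bx, -1j * Ey],
        [c * By,  -c * Bx,   0.0,    -1j * Ez],
        [1j * Ex,  1j * Ey,  1j * Ez, 0.0],
    ]


def lorentz_matrix(V, c=C):
    """Assumed alpha_ik such that x_i = alpha_ik x'_k (K' moves along x)."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    b = V / c
    return [
        [g,          0.0, 0.0, -1j * g * b],
        [0.0,        1.0, 0.0, 0.0],
        [0.0,        0.0, 1.0, 0.0],
        [1j * g * b, 0.0, 0.0, g],
    ]


def transform_tensor(Fp, a):
    """F_ik = alpha_im alpha_kl F'_ml: sixteen terms for every component."""
    return [[sum(a[i][m] * a[k][l] * Fp[m][l]
                 for m in range(4) for l in range(4))
             for k in range(4)] for i in range(4)]


def fields_from_tensor(F, c=C):
    """Read E and B back from a matrix of the form (6.29a)."""
    E = tuple((1j * F[i][3]).real for i in range(3))
    B = (F[1][2].real / c, F[2][0].real / c, F[0][1].real / c)
    return E, B


def fields_by_6_36(Ep, Bp, V, c=C):
    """Component formulas (6.36): fields in K from the fields in K'."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    E = (Ep[0], g * (Ep[1] + V * Bp[2]), g * (Ep[2] - V * Bp[1]))
    B = (Bp[0], g * (Bp[1] - V * Ep[2] / c**2), g * (Bp[2] + V * Ep[1] / c**2))
    return E, B


def nonzero_terms_in_F12(Fp, a):
    """Count the terms of the sum (6.35) that differ from zero."""
    return sum(1 for m in range(4) for l in range(4)
               if a[0][m] * a[1][l] * Fp[m][l] != 0)
```

With generic fields only two of the sixteen terms of (6.35) survive, namely those with m = 1, l = 2 and m = 4, l = 2, in agreement with the argument in the text.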

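It is seen from Eq. (6.36) that all field vectors change their
magnitude and direction on transition from one inertial frame of
reference K′ to another K. Only “longitudinal components” remain
invariable, i.e. components along the direction of the relative motion
(along the x axis).

Let us expand the electric and magnetic fields E and B into the
components parallel and perpendicular to the motion direction (the
unit vectors i, j, k are directed along the x, y, z axes
respectively), e.g.

$$\mathbf{E}_\parallel = E_x\mathbf{i}, \qquad \mathbf{E}_\perp = E_y\mathbf{j} + E_z\mathbf{k}.$$

Having noted that the velocity vector V of the coordinate system K′
has the components (V, 0, 0), we obtain

$$[\mathbf{V}\mathbf{B}'] = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ V & 0 & 0 \\ B'_x & B'_y & B'_z \end{vmatrix} = -\mathbf{j}VB'_z + \mathbf{k}VB'_y = V(-\mathbf{j}B'_z + \mathbf{k}B'_y),$$

$$[\mathbf{V}\mathbf{E}'] = V(-\mathbf{j}E'_z + \mathbf{k}E'_y).$$

Then Eq. (6.36) can be rewritten in the vector form:

$$\begin{aligned} \mathbf{E}_\parallel &= (\mathbf{E}' - [\mathbf{V}\mathbf{B}'])_\parallel, & \mathbf{E}_\perp &= \Gamma\,(\mathbf{E}' - [\mathbf{V}\mathbf{B}'])_\perp; \\ \mathbf{B}_\parallel &= \left(\mathbf{B}' + \frac{1}{c^2}[\mathbf{V}\mathbf{E}']\right)_\parallel, & \mathbf{B}_\perp &= \Gamma\left(\mathbf{B}' + \frac{1}{c^2}[\mathbf{V}\mathbf{E}']\right)_\perp. \end{aligned} \tag{6.38}$$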

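It is appropriate perhaps to recall here that all expressions of the
type $[\mathbf{V}\mathbf{A}]_\parallel$ are equal to zero while
expressions of the type $[\mathbf{V}\mathbf{A}]_\perp$ coincide with
the vector product itself for any A. The reverse transformation
equations are obtained by substitution of unprimed quantities for
primed ones and vice versa, and by changing the sign of V to the
opposite:

$$\begin{aligned} \mathbf{E}'_\parallel &= (\mathbf{E} + [\mathbf{V}\mathbf{B}])_\parallel, & \mathbf{E}'_\perp &= \Gamma\,(\mathbf{E} + [\mathbf{V}\mathbf{B}])_\perp; \\ \mathbf{B}'_\parallel &= \left(\mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}]\right)_\parallel, & \mathbf{B}'_\perp &= \Gamma\left(\mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}]\right)_\perp. \end{aligned} \tag{6.39}$$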
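The vector formulas (6.38) and (6.39) lend themselves to a direct numerical check. The following sketch, with hypothetical helper names and arbitrary sample values, implements (6.38) for a velocity of any direction and obtains (6.39) by reversing the sign of V; applying one after the other should return the original fields.

```python
# Illustrative sketch of the vector formulas (6.38) and (6.39).
# Helper names and numbers are assumptions made for this example only.
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    """Vector product [ab]."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product (ab)."""
    return sum(x * y for x, y in zip(a, b))


def fields_in_K(Ep, Bp, V, c=C):
    """Eq. (6.38): fields in K from E', B' in K' (V must be non-zero)."""
    v2 = dot(V, V)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)

    def combine(X):
        # X_parallel + Gamma * X_perpendicular, the split taken relative to V
        par = tuple(dot(X, V) / v2 * v for v in V)
        return tuple(p + g * (x - p) for x, p in zip(X, par))

    E = combine(tuple(e - w for e, w in zip(Ep, cross(V, Bp))))
    B = combine(tuple(b + w / c**2 for b, w in zip(Bp, cross(V, Ep))))
    return E, B


def fields_in_K_prime(E, B, V, c=C):
    """Eq. (6.39): the reverse transformation, i.e. (6.38) with V -> -V."""
    return fields_in_K(E, B, tuple(-v for v in V), c)


Ep, Bp = (100.0, 200.0, -300.0), (2e-6, -1e-6, 3e-6)
V = (0.3 * C, -0.2 * C, 0.4 * C)
E, B = fields_in_K(Ep, Bp, V)
print(fields_in_K_prime(E, B, V))  # should reproduce Ep, Bp up to rounding
```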


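In the case of non-relativistic velocities Γ ≈ 1, and we obtain from
Eq. (6.38)

$$\mathbf{E} = \mathbf{E}' + [\mathbf{B}'\mathbf{V}], \qquad \mathbf{B} = \mathbf{B}' - \frac{1}{c^2}[\mathbf{E}'\mathbf{V}]. \tag{6.40}$$

The following designations are used: $\mathbf{E} = \mathbf{E}_\parallel + \mathbf{E}_\perp$
and $\mathbf{B} = \mathbf{B}_\parallel + \mathbf{B}_\perp$. The
equations of reverse transformation from K to K′ are obtained as usual
by substitution of unprimed quantities for primed ones and vice versa
with a simultaneous change of sign of V:

$$\mathbf{E}' = \mathbf{E} + [\mathbf{V}\mathbf{B}], \qquad \mathbf{B}' = \mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}]. \tag{6.41}$$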


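In conclusion let us write out the transformation equations for D and
H. No computation is needed; it is sufficient to recall that we have
obtained the transformation equations for the components of the tensor
𝔉 = (cB, −iE), and now we are interested in analogous equations for the
tensor f = (H, −icD). Instead of Eq. (6.39) we obtain the following
expressions for the corresponding components:

$$\begin{aligned} \mathbf{D}'_\parallel &= \left(\mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}]\right)_\parallel, & \mathbf{D}'_\perp &= \Gamma\left(\mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}]\right)_\perp; \\ \mathbf{H}'_\parallel &= (\mathbf{H} - [\mathbf{V}\mathbf{D}])_\parallel, & \mathbf{H}'_\perp &= \Gamma\,(\mathbf{H} - [\mathbf{V}\mathbf{D}])_\perp. \end{aligned} \tag{6.42}$$

In the case of non-relativistic velocities, when Γ ≈ 1, Eq. (6.42)
turns into

$$\mathbf{D}' = \mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}], \qquad \mathbf{H}' = \mathbf{H} - [\mathbf{V}\mathbf{D}]. \tag{6.43}$$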


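Suppose that in the frame K′ the magnetic field B′ = 0. Then in the
frame K the relationship between E and B becomes very simple. First of
all, notice that $[\mathbf{V}\mathbf{E}'] = [\mathbf{V}\mathbf{E}'_\perp]$
since $[\mathbf{V}\mathbf{E}'_\parallel] = 0$. From Eq. (6.38) we
obtain

$$\begin{aligned} \mathbf{E} &= \mathbf{E}'_\parallel + \Gamma\mathbf{E}'_\perp, \\ \mathbf{B} &= \frac{\Gamma}{c^2}[\mathbf{V}\mathbf{E}'] = \frac{\Gamma}{c^2}[\mathbf{V}\mathbf{E}'_\perp] = \frac{1}{c^2}[\mathbf{V},\ \Gamma\mathbf{E}'_\perp] = \frac{1}{c^2}[\mathbf{V},\ \mathbf{E}'_\parallel + \Gamma\mathbf{E}'_\perp] = \frac{1}{c^2}[\mathbf{V}\mathbf{E}]. \end{aligned} \tag{6.44}$$

Similarly, if in the frame K′ the field E′ is equal to zero or in the
frame K the field E is equal to zero, then, respectively,

$$\mathbf{E} = -[\mathbf{V}\mathbf{B}], \qquad \mathbf{E}' = [\mathbf{V}\mathbf{B}']. \tag{6.45}$$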


In both cases and in any inertial frame the fields turn out to be 
mutually perpendicular. It follows from both the relativistic equa- 
tions (6.38) and the approximate equations (6.41) for low veloci- 
ties that if in one of the frames (say, K) an electric or a magnetic 
field is equal to zero, electric and magnetic fields in all other 
inertial frames of reference are perpendicular to each other. The 
same result can be obtained by employing the Lorentz transforma- 
tion invariants (see § 6.5). 

If the fields E′ and B′ are mutually perpendicular in a reference
frame K′, there exists a reference frame K in which one of the fields
disappears. It will be shown in § 6.5 that the expression c²B² − E²
remains invariant under the Lorentz transformation. Consequently, if
the condition c²B′² − E′² < 0 is satisfied in the frame K′, one can
obtain a purely electric field through the appropriate choice of a
reference frame, while in the case of c²B′² − E′² > 0 one gets a purely
magnetic field. We shall show how to find the velocity V of the
reference frame K. Suppose this velocity is perpendicular to B′ in the
case of c²B′² − E′² < 0 and to E′ in the case of c²B′² − E′² > 0. Then
$B_\parallel = 0$ in the former case and $E_\parallel = 0$ in the
latter case. Now one has to ensure that $B_\perp = 0$ in the former
case; in order to do this, the following condition should be met (see
Eq. (6.38)):

$$\mathbf{B}'_\perp + \frac{1}{c^2}[\mathbf{V}\mathbf{E}']_\perp = 0.$$

Multiplying both sides of this expression by E′ vectorwise and taking
into account the relations
$[\mathbf{V}\mathbf{E}']_\perp = [\mathbf{V}\mathbf{E}'_\perp] = [\mathbf{V}\mathbf{E}']$,
$[\mathbf{E}'[\mathbf{V}\mathbf{E}']] = \mathbf{V}E'^2$,
$\mathbf{B}'_\perp = \mathbf{B}'$, we obtain the reference frame
velocity V

$$\mathbf{V} = -\frac{c^2}{E'^2}[\mathbf{E}'\mathbf{B}']. \tag{6.46}$$

In much the same manner, we obtain in the other case

$$\mathbf{V} = -\frac{1}{B'^2}[\mathbf{E}'\mathbf{B}']. \tag{6.47}$$
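Both cases can be tried out numerically. The sketch below picks the velocity according to the sign of c²B′² − E′², transforms the fields by Eq. (6.38), and leaves it to the reader to confirm that one of them vanishes. It assumes mutually perpendicular E′ and B′ with c²B′² ≠ E′²; the minus sign in both velocity formulas follows from applying (6.38) with V understood as the velocity of K′ relative to K. All names and numbers are hypothetical.

```python
# Illustrative sketch: choose V so that one of the fields vanishes in K.
# Assumes E' and B' are perpendicular and c^2 B'^2 != E'^2.
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fields_in_K(Ep, Bp, V, c=C):
    """Eq. (6.38): fields in K from E', B' in K' (V must be non-zero)."""
    v2 = dot(V, V)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)

    def combine(X):
        par = tuple(dot(X, V) / v2 * v for v in V)
        return tuple(p + g * (x - p) for x, p in zip(X, par))

    E = combine(tuple(e - w for e, w in zip(Ep, cross(V, Bp))))
    B = combine(tuple(b + w / c**2 for b, w in zip(Bp, cross(V, Ep))))
    return E, B


def velocity_removing_field(Ep, Bp, c=C):
    """Velocity of K' relative to K that leaves a single field in K."""
    ExB = cross(Ep, Bp)
    E2, B2 = dot(Ep, Ep), dot(Bp, Bp)
    if c**2 * B2 < E2:
        return tuple(-c**2 * w / E2 for w in ExB)  # B vanishes in K
    return tuple(-w / B2 for w in ExB)             # E vanishes in K


Ep, Bp = (0.0, 300.0, 0.0), (0.0, 0.0, 5e-7)  # cB' is about 150 V/m < E'
V = velocity_removing_field(Ep, Bp)
print(fields_in_K(Ep, Bp, V))  # the magnetic field is zero up to rounding
```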


One can always find such an inertial frame of reference in which an
electric and a magnetic field are parallel to each other at a given
point (see, however, the comment on light waves at the end of § 6.5).
Obviously, provided there exists one such frame, there should be an
infinite number of frames possessing the same property. In fact, in
any inertial frame of reference K′ moving rectilinearly and uniformly
relative to K in the direction coinciding with the common direction of
E and B, the fields E′ and B′ will remain parallel since the field
components oriented along the motion direction do not vary.

Fig. 6.2. A transition to the reference frame in which an electric and
a magnetic field turn out to be parallel.

In order to find at least one frame in which the fields are parallel,
we shall proceed as follows. Suppose that the fields are parallel in
the frame K, i.e. [EB] = 0. Direct the velocity of the frame K′ (in
which the fields E′ and B′ are not parallel any more) along the
perpendicular to the fields E and B; assume that the x, x′ axis is
directed along the velocity V (see Fig. 6.2). Then $E_x = B_x = 0$
and, since the vector cross product is equal to zero,
$E_yB_z - E_zB_y = 0$. Substituting into this equation the components
of E and B, expressed via the components of E′ and B′ according to
Eq. (6.36), we arrive at the following equation:

$$\Gamma\,(E'_y + VB'_z)\,\Gamma\left(B'_z + \frac{V}{c^2}E'_y\right) = \Gamma\,(E'_z - VB'_y)\,\Gamma\left(B'_y - \frac{V}{c^2}E'_z\right).$$

The frame velocity V can be determined from this equation using the
given fields E′ and B′. Taking into account that according to
Eq. (6.36) $E'_x = B'_x = 0$, we can immediately find the direction of
the velocity V relative to E′ and B′. Indeed,
$[\mathbf{E}'\mathbf{B}'] = \mathbf{i}(E'_yB'_z - E'_zB'_y)$ and
$\mathbf{V} = V\mathbf{i}$, so that solving the foregoing equation,
one can write

$$\frac{\mathbf{V}/c}{1 + V^2/c^2} = -\frac{c\,[\mathbf{E}'\mathbf{B}']}{c^2B'^2 + E'^2}. \tag{6.48}$$

Thus, from the given vectors E′ and B′ in the frame K′ one can find the
frame K in which E and B are parallel. The velocity direction of this
frame coincides with that of [E′B′], while the velocity magnitude is
one of the roots of the quadratic equation (6.48). Surely, from the two
roots of Eq. (6.48) one should choose the one for which V < c. The case
E′B′ = 0 was examined above: one cannot obtain parallel fields here,
but it is possible to get either a purely magnetic or a purely
electric field.
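The choice of the root of Eq. (6.48) can be made concrete with a short sketch. Writing x = V/c and κ for the magnitude of the right-hand side of (6.48), the equation x/(1 + x²) = κ has the roots x = (1 ± √(1 − 4κ²))/(2κ), whose product equals 1, so exactly one of them is below 1. The code below (hypothetical names, arbitrary sample fields with E′B′ ≠ 0) takes that root and checks through Eq. (6.38) that the fields in K come out parallel.

```python
# Illustrative sketch: solve (6.48) for V and verify [EB] = 0 in the frame K.
# Names and numbers are assumptions made for this example only.
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fields_in_K(Ep, Bp, V, c=C):
    """Eq. (6.38): fields in K from E', B' in K' (V must be non-zero)."""
    v2 = dot(V, V)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)

    def combine(X):
        par = tuple(dot(X, V) / v2 * v for v in V)
        return tuple(p + g * (x - p) for x, p in zip(X, par))

    E = combine(tuple(e - w for e, w in zip(Ep, cross(V, Bp))))
    B = combine(tuple(b + w / c**2 for b, w in zip(Bp, cross(V, Ep))))
    return E, B


def velocity_for_parallel_fields(Ep, Bp, c=C):
    """Velocity of K' relative to K from (6.48), taking the root V < c."""
    ExB = cross(Ep, Bp)
    n = math.sqrt(dot(ExB, ExB))
    if n == 0.0:
        return (0.0, 0.0, 0.0)  # the fields are already parallel in K'
    kappa = c * n / (dot(Ep, Ep) + c**2 * dot(Bp, Bp))
    x = (1.0 - math.sqrt(1.0 - 4.0 * kappa**2)) / (2.0 * kappa)  # x = V/c < 1
    return tuple(-x * c * w / n for w in ExB)


Ep, Bp = (100.0, 200.0, -300.0), (2e-6, -1e-6, 3e-6)
V = velocity_for_parallel_fields(Ep, Bp)
E, B = fields_in_K(Ep, Bp, V)
print(cross(E, B))  # each component is zero up to rounding
```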

§ 6.5. The electromagnetic field invariants. Although an electric
field strength E and a magnetic field induction B vary under the
Lorentz transformation, there are some combinations of these fields
remaining invariable under it. These quantities are invariants of
antisymmetric 4-tensors of the second rank. We make use of two such
invariants (see Appendix I, § 6):

$$I_1 = F_{ik}^2, \qquad I_2 = F_{ik}F^*_{ik}.$$

Recalling the definitions of the tensors $F_{ik}$ and $F^*_{ik}$,

$$\mathfrak{F} = (c\mathbf{B}, -i\mathbf{E}), \qquad \mathfrak{F}^* = (-i\mathbf{E}, c\mathbf{B}),$$

and taking into account that the first invariant is the sum of the
components $F_{ik}$ squared and the second invariant represents the
pairwise products of the corresponding components of the tensors
$F_{ik}$ and $F^*_{ik}$, we can write at once
$I_1 = 2(c^2B^2 - E^2)$, $I_2 = -2ic(\mathbf{B}\mathbf{E})$.

Omitting immaterial constant factors, one can claim that an
electromagnetic field possesses two invariants (we shall not write out
the invariants of the tensor f and the combined invariants of 𝔉 and f
since they will not be needed):

$$I_1 = c^2B^2 - E^2, \qquad I_2 = \mathbf{B}\mathbf{E}.$$


From the existence of these two invariants follow the results, some of
which have been mentioned before. If in some IFR the fields E and B are
mutually orthogonal (BE = 0), they are also orthogonal in any other
inertial frame of reference. If in some reference frame E = cB, this
relationship holds in all inertial frames of reference.

It should be noted here that both invariants are equal to zero for a
light wave in vacuo. These properties, i.e. B ⊥ E and cB = E, are
maintained in any IFR.

It is clear that if $I_2 = 0$ and $I_1 \neq 0$, one can always find a
reference frame in which either E = 0 or B = 0 (depending on the sign
of $I_1$), i.e. pass over to either a purely magnetic or a purely
electric field. Conversely, if either E or B is equal to zero in some
frame, these fields will be mutually orthogonal in all other inertial
frames. Note that the quantity BE is not a “real” scalar since it
changes sign on transition from the left coordinate system to the
right one and vice versa, while the quantity (BE)² is a real scalar.
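The invariance of c²B² − E² and BE can be confirmed numerically with the field transformation written in the vector form of Eq. (6.38). The sketch below uses hypothetical helper names and arbitrary field values; the velocity is deliberately not aligned with any axis.

```python
# Illustrative sketch: the two field invariants before and after a boost.
# Names and numbers are assumptions made for this example only.
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fields_in_K(Ep, Bp, V, c=C):
    """Vector form of the field transformation (V must be non-zero)."""
    v2 = dot(V, V)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)

    def combine(X):
        par = tuple(dot(X, V) / v2 * v for v in V)
        return tuple(p + g * (x - p) for x, p in zip(X, par))

    E = combine(tuple(e - w for e, w in zip(Ep, cross(V, Bp))))
    B = combine(tuple(b + w / c**2 for b, w in zip(Bp, cross(V, Ep))))
    return E, B


def invariants(E, B, c=C):
    """Return the pair (c^2 B^2 - E^2, BE)."""
    return c**2 * dot(B, B) - dot(E, E), dot(B, E)


Ep, Bp = (100.0, 200.0, -300.0), (2e-6, -1e-6, 3e-6)
E, B = fields_in_K(Ep, Bp, (0.3 * C, -0.2 * C, 0.4 * C))
print(invariants(Ep, Bp))
print(invariants(E, B))  # the same pair of numbers up to rounding
```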

§ 6.6. The Lorentz force. Now let us consider the forces acting on
electric charges in an electromagnetic field. To avoid confusion, we
shall confine our presentation to spatial distribution of charges *.
In a co-moving reference frame K′ in which a considered space element
rests together with a charge, this charge experiences a force exerted
by an electric field (a magnetic field does not act on a charge at
rest). The force acting on a charge contained in a unit volume is
referred to as a force density. If a charge density in a co-moving
reference frame K′ is equal to ρ₀, the force density f′ is defined by
the equation

$$\mathbf{f}' = \rho_0\mathbf{E}',$$

where E′ is an electric field strength in K′.

The transition to any other IFR is associated with the variation of
the fields E and B; even if in a co-moving frame there was no magnetic
field and only an electric one was present, a magnetic field will
appear in any other IFR. Let us find the force density f′ expressed via
the field components E and B in an arbitrary inertial frame. First,
let us examine the case of non-relativistic velocities when Γ ≈ 1; then
according to Eq. (6.17) ρ = ρ₀Γ ≈ ρ₀, and according to Eq. (6.41)
E′ = E + [VB], and because of this

$$\mathbf{f} = \mathbf{f}' = \rho_0\mathbf{E}' = \rho\,\{\mathbf{E} + [\mathbf{V}\mathbf{B}]\}. \tag{6.49}$$


The last link of Eq. (6.49) defines the quantity which is usually 
called the Lorentz force density in electrodynamics. The Lorentz 
force defines the force acting on a unit volume containing a 
charge; this force is generated by the electric and magnetic fields 
of the frame K relative to which the charge moves at the velocity 
V. It is not surprising that the force f’ in the frame K’ turned out 
to be equal to the force f in the frame K since according to Eq. 
(5.34) a force magnitude does not change on transition from one 
IFR to another in a non-relativistic case. 

Of course, Eq. (6.49) can also be used in the case when the velocity
of charge motion is different at various points in space. In this case
each element of space will have its own co-moving reference frame and,
consequently, the velocity V will be different at various points.

* Point charges are discussed, for example, in [8], § 29.

Let us derive the expression for the Lorentz force by still another
method illustrating explicitly how Eq. (6.49) comes about. Let the
co-moving frame K′ have an electric and a magnetic field defined by the
vectors E′ and B′.

Making use of the superposition principle, we can describe this field
as a sum of two fields:

$$\begin{aligned} &\text{I} & \mathbf{E}'_1 &= 0, & \mathbf{B}'_1 &= \mathbf{B}'; \\ &\text{II} & \mathbf{E}'_2 &= \mathbf{E}', & \mathbf{B}'_2 &= 0. \end{aligned}$$

Obviously, the initial field represents just the sum of the two fields:
$\mathbf{E}' = \mathbf{E}'_1 + \mathbf{E}'_2$,
$\mathbf{B}' = \mathbf{B}'_1 + \mathbf{B}'_2$. However, the
transformation equations for the fields I and II are very simple and
allow us to get the answer right away. In the frame K′

$$\mathbf{f}' = \rho_0\mathbf{E}' = \rho_0\mathbf{E}'_2. \tag{6.50}$$

Using the first equation of (6.45), one can write down immediately the
electric field I in the frame K:

$$\mathbf{E}_1 = -[\mathbf{V}\mathbf{B}_1],$$

where $\mathbf{B}_1$ is the magnetic field in K. According to
Eq. (6.36) the electric field II in the frame K is equal to

$$\mathbf{E}_2 = E'_{2x}\mathbf{i} + \Gamma\,(E'_{2y}\mathbf{j} + E'_{2z}\mathbf{k}) \approx \mathbf{E}'_2$$

in the case when Γ ≈ 1. The total electric field in K is equal to the
sum of $\mathbf{E}_1$ and $\mathbf{E}_2$:

$$\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2 = \mathbf{E}'_2 - [\mathbf{V}\mathbf{B}_1]. \tag{6.51}$$

The magnetic field B in the frame K is equal to
$\mathbf{B}_1 + \mathbf{B}_2$. Having composed the vector cross
product $[\mathbf{V}\mathbf{B}] = [\mathbf{V}\mathbf{B}_1] + [\mathbf{V}\mathbf{B}_2]$,
we see from the second equation of (6.44) defining $\mathbf{B}_2$ that
the product $|[\mathbf{V}\mathbf{B}_2]| \sim V^2/c^2$ can be ignored in
a non-relativistic case. Therefore,
$[\mathbf{V}\mathbf{B}_1] \approx [\mathbf{V}\mathbf{B}]$ and we
obtain the Lorentz force (see Eq. (6.49)) from Eq. (6.51).

If in the frame K an electric field is equal to zero (E = 0) and a
magnetic field differs from zero, it follows from Eq. (6.45) that
E′ = [VB′]; so, the Lorentz force, appearing to be produced by a pure
magnetic field in the frame K, seems to be produced in a co-moving
frame K′ by a pure electric field. These examples show once again the
unity of an electromagnetic field and the relativity of its division
into an electric and a magnetic field.

A few words on lines of force of the field are relevant here. In each
reference frame a vector field can be correlated with a family of
vector lines of force. These lines are formally defined as curves whose
tangents at every point coincide with the direction of the field vector
at that point. A line of force is a useful auxiliary notion allowing
the properties of the field to be graphically exhibited. In contrast to
the ideas of the last century, however, no one attaches any physical
meaning to these lines now.

Suppose a charge or a permanent magnet moves in space. Should one say
in this case that a field and its lines of force move along?

A field is a method to describe what happens at a given point in
space. The magnet motion merely causes the field to vary in time at a
given point. And still, one may speak of the field motion induced by a
charge or magnet moving at a constant velocity, since this field moves
with them as a whole. The field transportation velocity is the velocity
at which a charge or a magnet moves. It is better, however, not to
speak of a motion of lines of force since the velocity of lines of
force has no physical meaning. The auxiliary nature of lines of force
is demonstrated particularly well by the fact that they may just
disappear in some reference frame for a certain field.

Fig. 6.3. The interaction of a charge q, moving at the velocity V
parallel to a current-carrying conductor, and a current. (a) A
conductor is at rest in the frame K, the charge and electrons move at
the velocity V. (b) A conductor moves at the velocity V in the frame
K′, the charge and electrons are at rest.

Here is another example illustrating the relative character of
forces acting in an electromagnetic field. Consider a cylindrical
conductor carrying a current and a negative charge q moving 
parallel to the conductor at the velocity V (Fig. 6.3). We shall 
fix the frame K to the conductor, and the frame K’ to the charge. 
In the frame K the charge experiences the Lorentz force induced 
by a magnetic field and directed at right angles to the conductor’s 
axis. Consequently, the charge approaches the conductor. In the 
frame K’, however, the charged particle is at rest and the magnetic 
field has no effect on it. Then, what is the reason causing the 
charge to deviate in terms of the frame K’? 

Here one needs to review a microscopic description of what is
happening in a conductor. A current originates in a conductor due to
the motion of free electrons since positive ions and fixed (valence)
electrons cannot migrate along a conductor. Let the density of
conduction electrons be equal to ρ₋, with their velocity in K (relative
to the conductor) equal to v₋. The density of the stationary positive
charges is equal to ρ₊, and due to the neutrality of the conductor
ρ₊ + ρ₋ = 0. Inasmuch as the conductor is neutral, there is no
electric field outside of it, and the force acting on the charge q
arises only from a magnetic field:

$$\mathbf{F} = q[\mathbf{V}\mathbf{B}].$$

The magnitude of the magnetic field induced by a rectilinear current
at the distance r from its axis is known:

$$B = \frac{\mu_0 I}{2\pi r};$$

the vector B coincides with the tangent of a circle lying in a plane
perpendicular to the current’s axis and having its centre on it. The
direction of the vector B is determined by the right-hand screw rule.
Hence, the force acting on the charge is directed toward the conductor
and is equal to

$$F = qVB = \frac{\mu_0 qVI}{2\pi r} = \frac{1}{4\pi\varepsilon_0 c^2}\,\frac{2qVI}{r}.$$

The current can be expressed via the conduction electron velocity v₋,
their density and the cross-sectional area S:

$$I = jS = \rho_- v_- S,$$

whence, in magnitude,

$$F = \frac{q}{2\pi\varepsilon_0}\,\frac{\rho_- S}{r}\,\frac{Vv_-}{c^2} = \frac{q}{2\pi\varepsilon_0}\,\frac{\rho_+ S}{r}\,\frac{V^2}{c^2}, \tag{6.52}$$

provided that for the sake of simplicity we assume the velocity of
electrons in metal to be equal to that of the charge q, i.e. V = v₋.

Now let us consider the same situation in the frame K′. The charge q
and conduction electrons are at rest in K′. This time, however, the
charges connected with the conductor (and whose density is equal to
ρ₊) move relative to the charge q. Although they induce a certain
magnetic field B′, it does not act on the charge q any more, since the
charge is motionless in K′. Whence it is clear at once that an electric
field must appear in the frame K′ since the charge must deviate toward
the axis in K′ as well. Its origin is easy to understand from the
results obtained by us earlier. Conduction electrons are at rest in the
frame K′, and therefore ρ₋ = Γρ′₋ (see Eq. (6.17)). Positive charges
connected with the conductor move at the velocity −V in the frame K′,
and so ρ′₊ = Γρ₊ (these charges were stationary in K). The resulting
charge density ρ′ is equal to ρ′₊ + ρ′₋ in the frame K′, and,
consequently,

$$\rho' = \frac{\rho_-}{\Gamma} + \Gamma\rho_+ = \rho_+\left(\Gamma - \frac{1}{\Gamma}\right) = \Gamma\rho_+\frac{V^2}{c^2},$$

where the relation ρ₊ = −ρ₋ is allowed for; this equation coincides
with Eq. (6.24). Consequently, a moving conductor is charged
positively with the space density ρ′. But an electric field of a
uniformly charged cylinder is also known from electrodynamics. It lies
in the planes perpendicular to the cylinder’s axis and is oriented
along the rays leaving the cylinder’s axis. Its magnitude is

$$E' = \frac{\rho' S}{2\pi\varepsilon_0 r} = \frac{\rho_+ S}{2\pi\varepsilon_0 r}\,\Gamma\frac{V^2}{c^2}.$$

Fig. 6.4. (a) In the frame K the charge density ρ (ρ = ρ₊ + ρ₋) is
equal to zero while the current density is equal to j ≠ 0. Therefore,
an electric field is absent in the frame K and there is only a
magnetic field B. (b) In the frame K′ the density of charges that
emerges is equal to ρ′ and the current density becomes equal to j′. A
magnetic field is equal to B′; besides, an electric field E′ also
appears.

This implies that the force acting on a negative charge q is
directed toward the conductor, and its magnitude in the frame K’
is equal to

$$F' = qE' = \frac{q\rho_+ S}{2\pi\varepsilon_0 r}\,\Gamma\mathrm{B}^2, \qquad \mathrm{B} = \frac{V}{c}.$$








Comparing this result with Eq. (6.52), we see that these forces
are equal in a non-relativistic approximation (Γ ≈ 1). Recalling
that the forces transform according to Eq. (5.34), we find that 
both ways of describing an observed phenomenon give identical 
results at any velocity V. The results pertaining to fields in the 
frames K and K’ are explained in Fig. 6.4. 
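The comparison of the two descriptions can also be checked numerically. The sketch below is only an illustration under stated assumptions: the function names and the numerical values of q, ρ₊, S and r are hypothetical, and the wire is treated as infinitely long, as in the text. It evaluates the magnetic force in K and the electric force in K’ and prints their ratio, which should equal Γ, in line with the transformation of a transverse force.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299_792_458.0         # speed of light, m/s


def lorentz_factor(V):
    """The factor written as capital Gamma in the text."""
    return 1.0 / math.sqrt(1.0 - (V / C) ** 2)


def force_in_K(q, rho_plus, S, V, r):
    """Magnetic force |q| V B in the frame K, where B = mu0 I / (2 pi r),
    |I| = rho_plus * S * V and mu0 = 1 / (eps0 c^2)."""
    return abs(q) * rho_plus * S * V ** 2 / (2 * math.pi * EPS0 * C ** 2 * r)


def force_in_K_prime(q, rho_plus, S, V, r):
    """Electric force |q| E' in the frame K', where the wire carries the
    net charge density rho' = Gamma * rho_plus * V^2 / c^2."""
    rho_prime = lorentz_factor(V) * rho_plus * (V / C) ** 2
    E_prime = rho_prime * S / (2 * math.pi * EPS0 * r)
    return abs(q) * E_prime


# Hypothetical numbers: an electron-sized charge 1 cm from a 1 mm^2 wire.
q, rho_plus, S, r = -1.602e-19, 1.0e10, 1.0e-6, 1.0e-2
for V in (1.0e-3, 0.1 * C, 0.6 * C):
    ratio = force_in_K_prime(q, rho_plus, S, V, r) / force_in_K(q, rho_plus, S, V, r)
    print(f"V/c = {V / C:.3e}  F'/F = {ratio:.12f}  Gamma = {lorentz_factor(V):.12f}")
```

At drift velocities typical of conduction electrons the ratio is indistinguishable from unity, which is the non-relativistic agreement stated above.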

In conclusion it should be emphasized that all the results
pertaining to forces which an electromagnetic field exerts on space
charges are obtained quite easily provided that the Lorentz force
density (see Eq. (6.49))

$$\mathbf{f} = \rho\{\mathbf{E} + [\mathbf{v}\mathbf{B}]\}$$


is written in a four-dimensional form. To pass over to the four- 
dimensional notation, rewrite the x-component of the Lorentz 
force as follows: 


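$$f_x = f_1 = \rho E_x + \rho v_y B_z - \rho v_z B_y = \left(-\frac{i}{c}s_4\right)(iF_{14}) + s_2\frac{F_{12}}{c} - s_3\left(-\frac{F_{13}}{c}\right) = \frac{1}{c}\left(F_{12}s_2 + F_{13}s_3 + F_{14}s_4\right) = \frac{1}{c}F_{1k}s_k.$$

The following relations are taken into account in this chain of
equations:

$$\rho = -\frac{i}{c}s_4, \quad \rho v_y = s_2, \quad \rho v_z = s_3, \qquad E_x = iF_{14}, \quad B_z = \frac{F_{12}}{c}, \quad B_y = -\frac{F_{13}}{c}.$$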


Analogous expressions are obtained for $f_y = f_2$ and $f_z = f_3$.
Hence it is clear that the 4-vector of a force density acting on a
charge in an electromagnetic field, to be denoted by $\vec{f}$, has
the components *

$$f_i = \frac{1}{c}F_{ik}s_k. \tag{6.53}$$


We have already pointed out that the separation of the force exerted
on the charge by electric and magnetic fields into the parts ρE
and ρ[vB] is relative. Both these forces constitute a united whole
combining naturally into a single four-dimensional expression
(Eq. (6.53)).

As we have seen, the first three density components lead to the 
conventional three-dimensional equation (6.49). Let us find the 
fourth component: 


$$f_4 = \frac{1}{c}F_{4k}s_k = \frac{1}{c}\left(F_{41}s_1 + F_{42}s_2 + F_{43}s_3\right) = \frac{i}{c}\,\rho(\mathbf{v}\mathbf{E}).$$


The quantity p(vE) has a simple meaning which is immediately 
seen as soon as the two sides of Eq. (6.49) are multiplied scalar- 
wise by v. Taking into consideration that [vB]v = 0, we get 


$$(\mathbf{f}\mathbf{v}) = \rho(\mathbf{v}\mathbf{E}).$$


* We expect that the reader will not forget that the letter F
supplemented with two subindices represents a component of the
tensor $\mathfrak{F}$. The components of the 4-force density have a
single subindex.

The left-hand side of the last equation represents the power
of the Lorentz force per unit of volume (forces exerted by a
magnetic field perform no work), so that

$$f_4 = \frac{i}{c}(\mathbf{f}\mathbf{v}).$$


Thus we have obtained a 4-force density vector whose components
are written down together as follows:

$$\vec{f}\,(f_1, f_2, f_3, f_4) = \vec{f}\left(f_x, f_y, f_z, \frac{i}{c}(\mathbf{f}\mathbf{v})\right). \tag{6.54}$$
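The four-dimensional expression for the force density is easy to verify numerically. The following sketch is an illustration only: the helper names and the field values are hypothetical, and the field tensor is taken in the form $\mathfrak{F} = (c\mathbf{B}, -i\mathbf{E})$ with $x_4 = ict$, as used in this chapter. It builds $F_{ik}$ and the 4-current $s_k = (\rho\mathbf{v}, ic\rho)$, contracts them, and compares the result with the three-dimensional Lorentz force density and with the power density ρ(vE).

```python
C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def field_tensor(E, B):
    """F_ik = (cB, -iE) with x4 = ict; Python indices 0..3 stand for 1..4."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, C * Bz, -C * By, -1j * Ex],
            [-C * Bz, 0, C * Bx, -1j * Ey],
            [C * By, -C * Bx, 0, -1j * Ez],
            [1j * Ex, 1j * Ey, 1j * Ez, 0]]


def four_force_density(E, B, rho, v):
    """f_i = (1/c) F_ik s_k with the 4-current s = (rho v, i c rho)."""
    F = field_tensor(E, B)
    s = [rho * v[0], rho * v[1], rho * v[2], 1j * C * rho]
    return [sum(F[i][k] * s[k] for k in range(4)) / C for i in range(4)]


# Hypothetical field and charge values, SI units.
E, B = (1.0, 2.0, 3.0), (1e-8, -2e-8, 3e-8)
rho, v = 2.0, (1e7, 2e7, -3e7)

f = four_force_density(E, B, rho, v)
lorentz = [rho * (e + w) for e, w in zip(E, cross(v, B))]
power = rho * dot(v, E)
print("space part :", [comp.real for comp in f[:3]], "vs", lorentz)
print("f_4 * c / i:", (f[3] * C / 1j).real, "vs", power)
```

The space part reproduces ρ{E + [vB]}, while the fourth component is purely imaginary and equal to (i/c)ρ(vE).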


Let us consider the force exerted by an electromagnetic field
on a unit of volume containing a charge $\rho_0$ in the reference
frame K⁰ co-moving with the charge. There the 4-force density has
the components $(\rho_0\mathbf{E}', 0)$, the prime marking the field
in the co-moving frame. Passing over to any other reference frame K,
we get

$$f_1 = \Gamma\rho_0E_x', \quad f_2 = \rho_0E_y', \quad f_3 = \rho_0E_z', \quad f_4 = i\mathrm{B}\Gamma\rho_0E_x' = i\frac{V}{c}\,\Gamma\rho_0E_x'.$$

Here the equations for a force density in the frame K are expressed
through the fields in the co-moving frame. Usually a force density
is expressed in terms of the quantities referred to the frame in
which the force density is determined. Making use of Eqs. (6.17)
and (6.36), we obtain
$\vec{f}\left(\rho(\mathbf{E} + [\mathbf{V}\mathbf{B}]),\ \frac{i}{c}\rho(\mathbf{E}\mathbf{V})\right)$
for non-relativistic velocities (ignoring the terms $V^2/c^2$).

In conclusion we shall write out the motion equation for a 
charged particle in a four-dimensional form: 


$$\frac{d}{d\tau}(mu_i) = \frac{1}{c}F_{ik}s_k. \tag{6.55}$$


§ 6.7. Covariance of the system of the Maxwell equations. The
Maxwell equations define the behaviour of an electromagnetic
field in the most adequate manner. They were proposed long before
the advent of the theory of relativity and surely before the
Lorentz transformation was identified. According to the principle
of relativity the form of the Maxwell equations must remain the
same in all inertial frames of reference. Consequently, the
Maxwell equations must be covariant relative to the Lorentz
transformation. It so happened that the system of Maxwell’s
equations satisfies these conditions when written in the form
proposed by its creator. To ascertain this, the system of Maxwell’s
equations, written usually in terms of three-dimensional equations,
should be rewritten in a four-dimensional form. Now we shall be
occupied with just that. The system of the Maxwell equations is
known to have the following form:

$$\operatorname{rot}\mathbf{H} = \mathbf{j} + \dot{\mathbf{D}}, \quad (a) \qquad \operatorname{div}\mathbf{D} = \rho; \quad (b) \tag{6.56}$$

$$\operatorname{rot}\mathbf{E} = -\dot{\mathbf{B}}, \quad (a) \qquad \operatorname{div}\mathbf{B} = 0. \quad (b) \tag{6.57}$$


We have split the equations into two lines, having combined the 
equations involving the average values of an electric and a mag- 
netic field E and B, and the equations for the subsidiary vectors 
H and D. 

In order to present Eqs. (6.56) and (6.57) in a four-dimensional 
form, we shall need the tensors (6.29a) and (6.31); we shall also 
make use of the definition of a 4-current density vector (6.12a). 
Note for future use, by the way, that the tensors $\mathfrak{F}$
and $\mathfrak{f}$ are linked in vacuo by the relation

$$\mathfrak{F} = \sqrt{\frac{\mu_0}{\varepsilon_0}}\;\mathfrak{f}. \tag{6.58}$$


Naturally, Eqs. (6.56) can be expressed via tensor (6.31) while 
Eqs. (6.57) via tensor (6.29a). 
Let us consider the x component of Eq. (6.56a):

$$\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - \dot{D}_x = j_x. \tag{6.59}$$

Recalling that according to Eq. (6.12a) $j_x = s_1$, and using the
first line of Eq. (6.31) together with the definitions $x_1 = x$,
$x_2 = y$, $x_3 = z$, $x_4 = ict$, we shall rewrite Eq. (6.59) as

$$\frac{\partial f_{12}}{\partial x_2} + \frac{\partial f_{13}}{\partial x_3} + \frac{\partial f_{14}}{\partial x_4} = s_1.$$

The two other components are given by similar expressions which can
be written in a general form as (i = 1, 2, 3):

$$\frac{\partial f_{ik}}{\partial x_k} = s_i, \tag{6.60}$$

where the summation is performed over k running from 1 to 4.
It is readily seen that when i = 4, we get Eq. (6.56b). Thus Eq.
(6.56) is rewritten in the form of Eq. (6.60), but now in terms of
4-tensor components (6.31).

Now let us consider the components of Eq. (6.57). For example, 
the x component of Eq. (6.57a) can be written as 


$$\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = -\frac{\partial B_x}{\partial t}. \tag{6.61}$$

Resorting to tensor (6.29a), one can rewrite Eq. (6.61) as
follows:

$$i\left(\frac{\partial F_{23}}{\partial x_4} + \frac{\partial F_{34}}{\partial x_2} + \frac{\partial F_{42}}{\partial x_3}\right) = 0. \tag{6.62}$$

It is not difficult to notice that consecutive terms in Eq. (6.62)
are obtained via a cyclic transposition of the three indices in each
of the preceding terms. The structure of Eq. (6.62), however,
becomes quite evident if we introduce the tensor $F^*_{ik}$ which is
dual to the tensor $F_{ik}$ (see Appendix I, § 6):

$$F^*_{ik} = \frac{1}{2}\,e_{iklm}F_{lm}, \tag{6.63}$$

where $e_{iklm}$ is a fully antisymmetric unit 4-tensor of the fourth
rank. One can easily see that the dual tensor $F^*_{ik}$ differs from
the tensor $F_{ik}$ only by transposed components of the imaginary
and real parts:

$$\mathfrak{F} = (c\mathbf{B}, -i\mathbf{E}), \qquad \mathfrak{F}^* = (-i\mathbf{E}, c\mathbf{B}), \tag{6.64}$$

or, written in full,

$$F^*_{ik} = \begin{pmatrix} 0 & -iE_z & iE_y & cB_x \\ iE_z & 0 & -iE_x & cB_y \\ -iE_y & iE_x & 0 & cB_z \\ -cB_x & -cB_y & -cB_z & 0 \end{pmatrix}. \tag{6.65}$$

Using this tensor, the pair of Maxwell’s equations (6.57) can be
rewritten in a four-dimensional form as follows:

$$\frac{\partial F^*_{ik}}{\partial x_k} = 0. \tag{6.66}$$

Let us make sure that Eq. (6.66) corresponds to the four equations 
of (6.57). 

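Eq. (6.66) contains four equations (i = 1, 2, 3, 4). Consider,
for example, the equation for i = 1:

$$\frac{\partial F^*_{1k}}{\partial x_k} = \frac{\partial F^*_{11}}{\partial x_1} + \frac{\partial F^*_{12}}{\partial x_2} + \frac{\partial F^*_{13}}{\partial x_3} + \frac{\partial F^*_{14}}{\partial x_4} = -i\frac{\partial E_z}{\partial y} + i\frac{\partial E_y}{\partial z} + \frac{\partial (cB_x)}{\partial (ict)} = 0,$$

or otherwise

$$\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = -\frac{\partial B_x}{\partial t}, \quad \text{i.e.} \quad (\operatorname{rot}\mathbf{E})_x = -(\dot{\mathbf{B}})_x.$$

Eq. (6.66) yields the two other components of the equation
$\operatorname{rot}\mathbf{E} = -\dot{\mathbf{B}}$ at i = 2, 3. For
i = 4 we get the following equation:

$$\frac{\partial F^*_{4k}}{\partial x_k} = \frac{\partial F^*_{41}}{\partial x_1} + \frac{\partial F^*_{42}}{\partial x_2} + \frac{\partial F^*_{43}}{\partial x_3} + \frac{\partial F^*_{44}}{\partial x_4} = 0,$$

representing Eq. (6.57b): div B = 0. So we see that Eq. (6.66)
comprises Maxwell’s equations (6.57).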
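The statement that the dual tensor is obtained from $F_{ik}$ by interchanging the parts cB and −iE can be checked by a direct contraction with the antisymmetric unit tensor. The sketch below is only an illustration: the function names and the field values are hypothetical, and the sign convention $e_{1234} = +1$ is an assumption consistent with Eq. (6.65).

```python
C = 299_792_458.0  # speed of light, m/s


def levi_civita(*idx):
    """Sign of the permutation idx of (0, 1, 2, 3); 0 if an index repeats."""
    if len(set(idx)) < len(idx):
        return 0
    perm, sign = list(idx), 1
    for i in range(len(perm)):
        while perm[i] != i:          # move the value perm[i] into its own slot
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def six_vector(a, b):
    """Antisymmetric tensor with space-space part a and space-time part b."""
    return [[0, a[2], -a[1], b[0]],
            [-a[2], 0, a[0], b[1]],
            [a[1], -a[0], 0, b[2]],
            [-b[0], -b[1], -b[2], 0]]


def dual(F):
    """F*_ik = (1/2) e_iklm F_lm, summed over l and m."""
    return [[0.5 * sum(levi_civita(i, k, l, m) * F[l][m]
                       for l in range(4) for m in range(4))
             for k in range(4)] for i in range(4)]


# Hypothetical field values, SI units.
E, B = (1.0, 2.0, 3.0), (4e-9, 5e-9, 6e-9)
cB = tuple(C * b for b in B)
miE = tuple(-1j * e for e in E)

F = six_vector(cB, miE)            # F  = (cB, -iE), Eq. (6.64)
F_star = dual(F)                   # contraction of Eq. (6.63)
expected = six_vector(miE, cB)     # F* = (-iE, cB), Eq. (6.65)
print(all(abs(F_star[i][k] - expected[i][k]) < 1e-12
          for i in range(4) for k in range(4)))
```

The helper `six_vector` simply arranges two three-dimensional vectors into the antisymmetric array used throughout this section, so the same routine produces both $F_{ik}$ and the expected form of $F^*_{ik}$.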

Quite often Eq. (6.57) is expressed directly through the tensor
$F_{ik}$. We present this notation here since it will be necessary
later on. The Maxwell equations (6.57) can be legitimately written
both in the form of Eq. (6.66) and as an equation of the type
obtained in (6.62):

$$\frac{\partial F_{ik}}{\partial x_l} + \frac{\partial F_{kl}}{\partial x_i} + \frac{\partial F_{li}}{\partial x_k} = 0. \tag{6.67}$$

There is no summation involved in Eq. (6.67). Three different
values of the indices i, k, l are to be chosen from the four
possible ones. The reader can verify that if two of these indices
are taken to be the same, and the antisymmetry of the tensor
$\mathfrak{F}$ ($F_{ik} = -F_{ki}$) is allowed for, Eq. (6.67) turns
into an identity. The structure of Eq. (6.67) shows that the
distribution of the chosen triad of numbers among the indices
i, k, l is insignificant. This implies that Eq. (6.67) contains
several independent equations, their number being equal to the
number of possible combinations, each containing three indices,
that can be formed from a collection of four indices, that is
$C_4^3 = C_4^1 = 4$. We let the reader make sure for himself that
the four equations (6.57) follow from Eq. (6.67).

Now the Maxwell equations can be readily proved to be co- 
variant. We have seen that they can be written as Eq. (6.60) and 
(6.66) or (6.60) and (6.67). But Eqs. (6.60) and (6.66) represent
the relations between 4-vectors since the expression
$\partial f_{ik}/\partial x_k$ is a
vector (see Appendix I, § 5). Eq. (6.60) differs from Eq. (6.66)
by the zero vector featured on the right-hand side of Eq. (6.66). 
As to Eq. (6.67), it is explicitly presented in a tensor form and 
consequently is covariant. Thus, the four-dimensional notation of 
Maxwell's equations itself indicates their covariance. 

The system of Maxwell’s equations is not confined to Eqs. (6.56) 
and (6.57). We have already quoted the charge conservation law 
in a covariant form (see p. 184). It remains only to rewrite the 
“material equations” in a covariant form. 

§ 6.8. The Minkowski equations for moving media (the trans- 
formation of material equations). In the previous section we saw 
that the system of Maxwell’s equations (6.56) and (6.57) retains 
its appearance in all inertial frames of reference. However, the 
Maxwell equations yield an unambiguous picture of electromag- 
netic phenomena only when material equations, characterizing a 
medium in which electromagnetic phenomena occur, are specified. 
As usual, the reference frame in which the medium (or its portion)
rests will be referred to as a co-moving one. In the case of a
uniform isotropic medium the material equations have the following
form in a co-moving frame:

$$\mathbf{D}' = \varepsilon\mathbf{E}', \tag{6.68}$$

$$\mathbf{B}' = \mu\mathbf{H}', \tag{6.69}$$

$$\mathbf{j}' = \sigma\mathbf{E}', \tag{6.70}$$

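with a permittivity ε, permeability μ and conductivity σ all being
constants. Consider the motion of a medium relative to a
“laboratory” frame. In the frame co-moving with the medium the
Maxwell equations for a stationary medium are valid. Due to the
principle of relativity the material constants ε, μ, σ must be the
same both in a stationary medium in a “laboratory” frame and in the
reference frame co-moving with the medium. Inasmuch as the
transformation equations for the vectors E, B, H and D are known,
the relationship between them can be found in any other inertial
frame differing from K’. Let us write out the necessary
transformation equations, having split them into the longitudinal
and transverse (relative to the reference frame velocity V) parts
(see § 6.4):

$$\mathbf{E}'_\parallel = (\mathbf{E} + [\mathbf{V}\mathbf{B}])_\parallel, \qquad \mathbf{E}'_\perp = \Gamma(\mathbf{E} + [\mathbf{V}\mathbf{B}])_\perp; \tag{6.71}$$

$$\mathbf{B}'_\parallel = \left(\mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}]\right)_\parallel, \qquad \mathbf{B}'_\perp = \Gamma\left(\mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}]\right)_\perp; \tag{6.71'}$$

$$\mathbf{D}'_\parallel = \left(\mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}]\right)_\parallel, \qquad \mathbf{D}'_\perp = \Gamma\left(\mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}]\right)_\perp; \tag{6.72}$$

$$\mathbf{H}'_\parallel = (\mathbf{H} - [\mathbf{V}\mathbf{D}])_\parallel, \qquad \mathbf{H}'_\perp = \Gamma(\mathbf{H} - [\mathbf{V}\mathbf{D}])_\perp. \tag{6.73}$$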


We should recall once more that all expressions of the type
$[\mathbf{V}\mathbf{A}]_\parallel$ are equal to zero for any A since
we deal with projections on the velocity direction and a vector
cross product is perpendicular to the velocity V. If the
corresponding expressions are substituted into Eqs. (6.68) and
(6.69), we obtain identical relationships for both the longitudinal
and the transverse components (in which the factor Γ cancels out).
These relationships can be combined as follows:

$$\mathbf{D} + \frac{1}{c^2}[\mathbf{V}\mathbf{H}] = \varepsilon(\mathbf{E} + [\mathbf{V}\mathbf{B}]), \tag{6.74}$$

$$\mathbf{B} - \frac{1}{c^2}[\mathbf{V}\mathbf{E}] = \mu(\mathbf{H} - [\mathbf{V}\mathbf{D}]). \tag{6.75}$$


Eqs. (6.74) and (6.75) are called the Minkowski equations; ε
and μ appearing in these equations represent the permittivity and
permeability of the medium at rest. These equations differ
essentially from Eqs. (6.68) and (6.69) in that they involve all of
the field vectors simultaneously. Using Eq. (6.75) one can easily
eliminate B from Eq. (6.74) and obtain an equation involving only
the three vectors E, D, H, or eliminate D from Eq. (6.75), using
Eq. (6.74).

The equations appear simpler when put down separately in 
terms of longitudinal and transverse components: 








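$$\mathbf{D}_\parallel = \varepsilon\mathbf{E}_\parallel, \qquad \mathbf{B}_\parallel = \mu\mathbf{H}_\parallel, \tag{6.76}$$

$$\left(1 - \frac{n^2V^2}{c^2}\right)\mathbf{D}_\perp = \varepsilon\left(1 - \frac{V^2}{c^2}\right)\mathbf{E}_\perp + (\varepsilon\mu - \varepsilon_0\mu_0)[\mathbf{V}\mathbf{H}],$$

$$\left(1 - \frac{n^2V^2}{c^2}\right)\mathbf{B}_\perp = \mu\left(1 - \frac{V^2}{c^2}\right)\mathbf{H}_\perp - (\varepsilon\mu - \varepsilon_0\mu_0)[\mathbf{V}\mathbf{E}]. \tag{6.77}$$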


The first equation of (6.77) is obtained from Eq. (6.74) into 
which the expression for B is substituted from Eq. (6.75) and then 
only a transverse component of the relation obtained is taken. In 
much the same manner one obtains the second equation of (6.77). 

It is seen from these equations that although the vectors B and H,
as well as D and E, coincide in direction in an isotropic medium in
a co-moving frame K’, this is not the case in other reference
frames.

Of course, when one examines the motion of a medium, the case
of non-relativistic velocities proves to be most interesting. If
one ignores the terms $V^2/c^2$ and $n^2V^2/c^2$ compared to unity
in Eq. (6.77) ($n = \sqrt{\varepsilon\mu/\varepsilon_0\mu_0}$ is the
refraction index of the medium, see Chapter 7), Eqs. (6.76) and
(6.77) take a simpler form:

$$\mathbf{D} = \varepsilon\mathbf{E} + \frac{1}{c^2}(n^2 - 1)[\mathbf{V}\mathbf{H}], \qquad \mathbf{B} = \mu\mathbf{H} - \frac{1}{c^2}(n^2 - 1)[\mathbf{V}\mathbf{E}]. \tag{6.78}$$

This form of material equations written for a moving medium is
employed very often. Due to the equivalence of all IFRs in vacuo
(n = 1) the last equations turn into Eqs. (6.68) and (6.69) in this
case.
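The Minkowski equations can be tested numerically by starting from the material equations in the co-moving frame and transforming every field to the laboratory frame. The sketch below is an illustration under stated assumptions: the helper names and all numerical values (a glass-like permittivity, the fields E’ and H’, the velocity V) are hypothetical, and the inverse transformation is obtained from Eqs. (6.71)-(6.73) by changing the sign of V.

```python
import math

C = 299_792_458.0            # speed of light, m/s
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
MU0 = 1.0 / (EPS0 * C ** 2)  # vacuum permeability, H/m


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def comb(alpha, a, beta, b):
    """Linear combination alpha*a + beta*b of two 3-vectors."""
    return tuple(alpha * x + beta * y for x, y in zip(a, b))


def to_lab(X1, Y1, V, k):
    """Lab-frame field X from the rest-frame pair (X', Y'):
    X_par = X'_par,  X_perp = Gamma * (X'_perp + k [V Y'])."""
    G = 1.0 / math.sqrt(1.0 - dot(V, V) / C ** 2)
    X_par = comb(dot(X1, V) / dot(V, V), V, 0.0, V)
    X_perp = comb(1.0, X1, -1.0, X_par)
    return comb(1.0, X_par, G, comb(1.0, X_perp, k, cross(V, Y1)))


def minkowski_sides(E1, H1, eps, mu, V):
    """Both sides of Eqs. (6.74) and (6.75) evaluated in the lab frame K."""
    D1 = comb(eps, E1, 0.0, E1)      # D' = eps E'
    B1 = comb(mu, H1, 0.0, H1)       # B' = mu H'
    E = to_lab(E1, B1, V, -1.0)
    B = to_lab(B1, E1, V, 1.0 / C ** 2)
    D = to_lab(D1, H1, V, -1.0 / C ** 2)
    H = to_lab(H1, D1, V, 1.0)
    eq74 = (comb(1.0, D, 1.0 / C ** 2, cross(V, H)),
            comb(eps, E, eps, cross(V, B)))
    eq75 = (comb(1.0, B, -1.0 / C ** 2, cross(V, E)),
            comb(mu, H, -mu, cross(V, D)))
    return eq74, eq75


# Hypothetical medium and rest-frame fields.
eps, mu = 2.25 * EPS0, MU0
E1, H1 = (100.0, 200.0, -50.0), (0.3, -0.1, 0.2)
V = (0.3 * C, 0.1 * C, -0.2 * C)

for name, (lhs, rhs) in zip(("(6.74)", "(6.75)"), minkowski_sides(E1, H1, eps, mu, V)):
    print(name, lhs, rhs)
```

Both printed pairs should agree to rounding error for any velocity below c, which reflects the fact that the factor Γ cancels out of the transverse components.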

It is helpful to rewrite the material equations (6.68)-(6.70) in
a four-dimensional tensor form. We shall not derive these
equations; we shall just write them out and check that we get Eqs.
(6.68)-(6.70) in the frame in which the medium is at rest. Let us
introduce a four-dimensional velocity
$\vec{U}(\Gamma\mathbf{V}, ic\Gamma)$ for a medium. (Γ is written
here since the velocity V of an object or a medium is assumed to be
equal to that of the reference frame K’.) In a co-moving frame K’
the 4-velocity $\vec{U}'$ has the components $U_1' = 0$, $U_2' = 0$,
$U_3' = 0$, $U_4' = ic$.
The reader can easily verify that the tensor equations

$$\frac{1}{c}f_{ik}U_k = \varepsilon F_{ik}U_k, \tag{6.79}$$

$$\frac{1}{c}\left(F_{ik}U_l + F_{kl}U_i + F_{li}U_k\right) = \mu\left(f_{ik}U_l + f_{kl}U_i + f_{li}U_k\right), \tag{6.80}$$

$$s_i = \frac{\sigma}{c}F_{ik}U_k \tag{6.81}$$

lead to Eqs. (6.68), (6.69) and (6.70) respectively if the
components of $\vec{U}'$ are substituted into them.

In Eqs. (6.79) and (6.81) the summation is carried out over k.
The total number of equations amounts to 4, but since at i = 4 we 
get an identity, there are only three equations, in fact. In Eq. 
(6.80) one has to find the number of possible combinations of i, k
and l that can be formed from four values 1, 2, 3, 4 taken three
at a time. This number is equal to $C_4^3 = 4$, but since the
combination 1, 2, 3 yields an identity, we come back again to three
equations as it should be. Having derived the correct expressions 
for material equations in the frame K’, we prove that a tensor 
notation is also correct. 

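We shall illustrate an application of the tensor notation by the
example of Eq. (6.81). Suppose that a current density is observed
in a co-moving frame K’ (in which the medium is at rest) and the
charge density is equal to zero, i.e. $\vec{s}'(j_x', j_y', j_z', 0)$.
The 4-velocity of the medium in the frame K’ is equal to
$\vec{U}'(0, 0, 0, ic)$. Here are the $s_i'$ components:

$$s_1' = \frac{\sigma}{c}F_{1k}'U_k' = \frac{\sigma}{c}F_{14}'U_4' = \frac{\sigma}{c}(-iE_x')(ic) = \sigma E_x',$$

i.e. $j_x' = \sigma E_x'$; similarly $j_y' = \sigma E_y'$,
$j_z' = \sigma E_z'$. The fourth component

$$s_4' = \frac{\sigma}{c}F_{4k}'U_k' = 0 \qquad (s_4' = ic\rho' = 0,\ F_{44}' = 0).$$

But in the reference frame relative to which the medium moves

$$s_1 = \frac{\sigma}{c}F_{1k}U_k = \frac{\sigma}{c}\left(F_{12}U_2 + F_{13}U_3 + F_{14}U_4\right) = \frac{\sigma}{c}(-iE_x)\,ic\Gamma = \sigma\Gamma E_x,$$

$$s_2 = \frac{\sigma}{c}F_{2k}U_k = \frac{\sigma}{c}\left(F_{21}U_1 + F_{24}U_4\right) = \frac{\sigma}{c}\left\{(-cB_z)\Gamma V + (-iE_y)\,ic\Gamma\right\} = \sigma\Gamma\{E_y + [\mathbf{V}\mathbf{B}]_y\},$$

and similarly for $s_3$, since

$$U_1 = \Gamma V, \quad U_2 = U_3 = 0, \quad U_4 = ic\Gamma.$$

The final result is obvious:

$$\mathbf{j} = \sigma\Gamma\{\mathbf{E} + [\mathbf{V}\mathbf{B}]\}. \tag{6.82}$$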


Its meaning is quite clear: the current density in a medium
with a conductivity σ is determined by the magnitude of the
electric field in this medium; in accordance with Eq. (6.41) it is
this field, E + [VB], that appears as the factor multiplying σ
when Γ ≈ 1.

The fourth equation defines the charge density associated with
the conductivity current:

$$s_4 = ic\rho_{cond} = \frac{\sigma}{c}F_{4k}U_k = \frac{i\sigma}{c}\,\Gamma(\mathbf{E}\mathbf{V}) = i\Gamma\frac{(\mathbf{j}'\mathbf{V})}{c}, \tag{6.83}$$

or

$$\rho_{cond} = \Gamma\frac{(\mathbf{V}\mathbf{j}')}{c^2},$$

where j’ = σE’ is the current density in the co-moving frame, in
complete agreement with Eq. (6.24).
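The tensor form of Ohm’s law lends itself to a short numerical check. The sketch below is only an illustration: the function names and the values of σ, E, B and V are hypothetical. It contracts $F_{ik}$ with the 4-velocity of the medium $\vec{U}(\Gamma\mathbf{V}, ic\Gamma)$ and compares the result with $\sigma\Gamma\{\mathbf{E} + [\mathbf{V}\mathbf{B}]\}$ and with the charge density $\sigma\Gamma(\mathbf{V}\mathbf{E})/c^2$.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def field_tensor(E, B):
    """F_ik = (cB, -iE) with x4 = ict; Python indices 0..3 stand for 1..4."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, C * Bz, -C * By, -1j * Ex],
            [-C * Bz, 0, C * Bx, -1j * Ey],
            [C * By, -C * Bx, 0, -1j * Ez],
            [1j * Ex, 1j * Ey, 1j * Ez, 0]]


def ohm_four_current(E, B, V, sigma):
    """s_i = (sigma/c) F_ik U_k with U = (Gamma V, i c Gamma), Eq. (6.81).
    Returns the 3-current j, the charge density rho and Gamma."""
    G = 1.0 / math.sqrt(1.0 - dot(V, V) / C ** 2)
    U = [G * V[0], G * V[1], G * V[2], 1j * C * G]
    F = field_tensor(E, B)
    s = [sigma / C * sum(F[i][k] * U[k] for k in range(4)) for i in range(4)]
    j = tuple(comp.real for comp in s[:3])
    rho = (s[3] / (1j * C)).real
    return j, rho, G


# Hypothetical lab-frame fields, medium velocity and conductivity.
E, B = (3.0, -2.0, 1.0), (1e-8, 2e-8, -1e-8)
V, sigma = (0.3 * C, -0.2 * C, 0.1 * C), 5.0

j, rho, G = ohm_four_current(E, B, V, sigma)
j_expected = tuple(sigma * G * (e + w) for e, w in zip(E, cross(V, B)))
rho_expected = sigma * G * dot(V, E) / C ** 2
print(j, j_expected)
print(rho, rho_expected)
```

The velocity here is not restricted to the x axis, since the contraction $F_{ik}U_k$ does not depend on the orientation of the axes.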

It is worthwhile to consider Ohm’s law in the case of moving 
media, i.e. the material equation (6.70). We shall see that the 
convection current pv and the conductivity current are closely 
interlocked as it becomes obvious right after j and icp are com- 
bined into a single 4-vector. The difference between a convection 
current and a conduction current is caused by the choice of a ref- 
erence frame. Therefore it is natural that both currents alike in- 
duce a magnetic field. 

We shall assume that a conductivity current represents a mo- 
tion of charges with respect to a medium whereas a convection 
current arises due to the presence of charges in a medium owing 
to the motion of this medium. 

Suppose that in a certain frame K’ there is a conductivity current
j’ = σE’ and, besides, a charge density ρ’. These quantities
jointly constitute a 4-current which can be transformed to any
reference frame by means of Eq. (6.15a). Having expressed the
components of j and the density ρ through j’ and ρ’ in the
reference frame K’, we obtain

$$j_x = \Gamma(j_x' + V\rho'), \quad j_y = j_y', \quad j_z = j_z', \quad \rho = \Gamma\left(\rho' + \frac{V}{c^2}j_x'\right). \tag{6.84}$$


It is seen from the first formula of (6.84) that the current
$j_x$ incorporates a convection current ΓVρ’ = Vρ, so that it is
not proportional to σ any more. This is inconvenient because at
σ = 0 a conductivity current must turn into zero. How can a
conductivity current be distinguished in the general case? To do
this, one must recall that if in the frame K’ there is a charge
density ρ’ = ρ₀, we obtain a 4-current density in any other frame
K (see Eq. (6.20))

$$s_i^{conv} = \rho_0u_i, \tag{6.85}$$
where $u_i$ is the 4-velocity of the charge. This current should be
called a convection current; accordingly, $s_i$ in Eq. (6.85) is
supplemented with a superscript “conv”. Suppose we have a 4-current
whose components are $s_i$, and we want to represent it as the
sum of a conductivity current and a convection current. First of
all, let us express ρ₀ through the 4-vectors $\vec{s}$ and
$\vec{u}$. Having multiplied both sides of the equation
$s_i = \rho_0u_i$ by the corresponding components $u_i$ and added
up, we get $s_iu_i = \rho_0u_i^2$; but according to Eq. (5.7)
$u_i^2 = -c^2$, so that

$$\rho_0 = -\frac{s_ku_k}{c^2}. \tag{6.86}$$





Consequently, the convection current can be put down as

$$s_i^{conv} = -\frac{s_ku_k}{c^2}\,u_i. \tag{6.87}$$

In order to obtain components of a 4-conductivity current, one
has to subtract components of Eq. (6.87) from $s_i$:

$$s_i^{cond} = s_i - s_i^{conv} = s_i + \frac{s_ku_k}{c^2}\,u_i. \tag{6.88}$$

On the other hand, in accordance with Eq. (6.81) the quantity
$s_i^{cond}$ can be put down as

$$s_i^{cond} = \frac{\sigma}{c}F_{ik}u_k. \tag{6.89}$$


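Having equated these expressions, we obtain:

$$s_i + \frac{s_ku_k}{c^2}\,u_i = \frac{\sigma}{c}F_{ik}u_k. \tag{6.90}$$

Utilizing the definitions $\vec{s}(j_x, j_y, j_z, ic\rho)$,
$\vec{u}(\gamma v_x, \gamma v_y, \gamma v_z, ic\gamma)$, we get

$$\mathbf{j} + \gamma^2\mathbf{v}\left(\frac{(\mathbf{j}\mathbf{v})}{c^2} - \rho\right) = \sigma\gamma\{\mathbf{E} + [\mathbf{v}\mathbf{B}]\} \tag{6.91}$$

in a three-dimensional form.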

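Let us separate the terms proportional to the conductivity σ in
Eq. (6.91). For this purpose the left-hand and right-hand sides of
Eq. (6.91) are multiplied scalarly by v. Introducing the usual
designations γ and β (so that $1 + \gamma^2\beta^2 = \gamma^2$), we
get

$$(\mathbf{j}\mathbf{v}) - \rho v^2 = \frac{\sigma}{\gamma}(\mathbf{E}\mathbf{v}),$$

or

$$\frac{(\mathbf{j}\mathbf{v})}{c^2} - \rho = -\frac{\rho}{\gamma^2} + \frac{\sigma}{\gamma c^2}(\mathbf{E}\mathbf{v}). \tag{6.92}$$

Substituting Eq. (6.92) into Eq. (6.91), we finally obtain

$$\mathbf{j} = \rho\mathbf{v} + \sigma\gamma\left\{\mathbf{E} + [\mathbf{v}\mathbf{B}] - \frac{\mathbf{v}}{c^2}(\mathbf{E}\mathbf{v})\right\}. \tag{6.93}$$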


Thus, the term “conductivity current” can be attributed to the
quantity $\mathbf{j}^{cond} = \mathbf{j} - \rho\mathbf{v}$. The
field existing in a substance moving relative to a given reference
frame is often denoted by

$$\mathbf{E}^* = \mathbf{E} + [\mathbf{v}\mathbf{B}]. \tag{6.94}$$

Then Eq. (6.93) can be rewritten in the following form:

$$\mathbf{j}^{cond} = \sigma\gamma\left\{\mathbf{E}^* - \frac{\mathbf{v}}{c^2}(\mathbf{E}^*\mathbf{v})\right\}. \tag{6.95}$$
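The separation of the current into convection and conductivity parts can be verified numerically. The sketch below is an illustration under stated assumptions: the names and numbers are hypothetical, and the medium is taken to move along the x axis. The rest-frame quantities j’ = σE’ and ρ’ are transformed to the frame K by Eq. (6.84), the fields by the inverse of Eqs. (6.71) and (6.71’), and then j − ρv is compared with the right-hand side of the formula for the conductivity current.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def to_lab(E1, B1, j1, rho1, v):
    """Quantities in K for a medium moving along +x at the speed v
    (inverse of Eqs. (6.71), (6.71') together with Eq. (6.84))."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    E = (E1[0], g * (E1[1] + v * B1[2]), g * (E1[2] - v * B1[1]))
    B = (B1[0], g * (B1[1] - v * E1[2] / C ** 2), g * (B1[2] + v * E1[1] / C ** 2))
    j = (g * (j1[0] + v * rho1), j1[1], j1[2])
    rho = g * (rho1 + v * j1[0] / C ** 2)
    return E, B, j, rho, g


def conduction_current(E, B, vel, sigma, g):
    """sigma * gamma * {E* - v (E* v) / c^2} with E* = E + [vB]."""
    E_star = tuple(e + w for e, w in zip(E, cross(vel, B)))
    proj = dot(E_star, vel) / C ** 2
    return tuple(sigma * g * (es - vi * proj) for es, vi in zip(E_star, vel))


# Hypothetical rest-frame data for a conducting, charged medium.
sigma, v = 5.0, 0.6 * C
E1, B1, rho1 = (3.0, -2.0, 1.0), (1e-8, 2e-8, -1e-8), 1e-7
j1 = tuple(sigma * e for e in E1)          # j' = sigma E'

E, B, j, rho, g = to_lab(E1, B1, j1, rho1, v)
vel = (v, 0.0, 0.0)
lhs = tuple(ji - rho * vi for ji, vi in zip(j, vel))   # j - rho v
rhs = conduction_current(E, B, vel, sigma, g)
print(lhs)
print(rhs)
```

With these inputs the longitudinal component of j − ρv comes out as σE’ₓ/γ and the transverse ones as σE’ᵧ and σE’_z, so the quantity vanishes together with σ whatever the charge density ρ’.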


This equation resembles very much the force transformation 
equation (5.35). To complete the transition to equations of mov- 
ing media electrodynamics, one has to find out how to describe 
boundary conditions when a media interface moves. The continuity 
condition for normal components of induction follows from the 
equations divD=0 and divB=0, which, according to Eqs. 
(6.66) and (6.60), keep their appearance on transition from one 
inertial frame of reference to another. Therefore, at the interface 


$$D_{n1} = D_{n2}, \qquad B_{n1} = B_{n2}. \tag{6.96}$$


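Let us examine now the boundary conditions for tangent components
of field strengths. Considering first the reference frame K’
co-moving with the interface, we obtain the continuity condition
for tangent components of E’ and H’ in this frame. But in terms
of the frame K relative to which the interface moves at the
velocity u, the fields E’ and H’ take the following form (see Eqs.
(6.41) and (6.43)):

$$\mathbf{E}' = \mathbf{E} + [\mathbf{u}\mathbf{B}], \qquad \mathbf{H}' = \mathbf{H} - [\mathbf{u}\mathbf{D}]. \tag{6.97}$$

Let us draw a perpendicular n (a unit vector) to the interface
plane and denote the projection of the velocity u on this
perpendicular by $u_n$. Let us find the projections of Eq. (6.97)
on the plane perpendicular to n. Remembering that
$[\mathbf{n}\mathbf{E}] = [\mathbf{n}, \mathbf{E}_n + \mathbf{E}_t] = [\mathbf{n}\mathbf{E}_t]$,
we shall write the equation $\mathbf{E}'_{t1} = \mathbf{E}'_{t2}$
as $[\mathbf{n}\mathbf{E}'_1] = [\mathbf{n}\mathbf{E}'_2]$, i.e.

$$[\mathbf{n}\mathbf{E}_1] + [\mathbf{n}[\mathbf{u}\mathbf{B}_1]] = [\mathbf{n}\mathbf{E}_2] + [\mathbf{n}[\mathbf{u}\mathbf{B}_2]],$$

or

$$[\mathbf{n}, \mathbf{E}_2 - \mathbf{E}_1] = \mathbf{u}\,(\mathbf{n}(\mathbf{B}_1 - \mathbf{B}_2)) + (\mathbf{B}_2 - \mathbf{B}_1)(\mathbf{u}\mathbf{n}).$$

Since according to Eq. (6.96) $\mathbf{n}\mathbf{B}_1 = \mathbf{n}\mathbf{B}_2$,
we finally obtain

$$[\mathbf{n}, \mathbf{E}_2 - \mathbf{E}_1] = u_n(\mathbf{B}_2 - \mathbf{B}_1), \tag{6.98}$$

and similarly,

$$[\mathbf{n}, \mathbf{H}_2 - \mathbf{H}_1] = -u_n(\mathbf{D}_2 - \mathbf{D}_1). \tag{6.99}$$

These equations, together with Eq. (6.96), constitute the boundary
conditions for field vectors.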

§ 6.9. The transformation of electric and magnetic moments. If
we combine an electric and a magnetic moment P and M into a
single antisymmetric tensor (6.33) we can immediately write
transformation equations for components of these quantities.

Let us denote the polarization and magnetization, determined in
the reference frame co-moving with the substance, by P⁰ and M⁰
respectively. Then an observer relative to whom the substance moves
at the velocity V will get

$$M_x = M_x^0, \quad M_y = \Gamma(M_y^0 + VP_z^0), \quad M_z = \Gamma(M_z^0 - VP_y^0),$$

$$P_x = P_x^0, \quad P_y = \Gamma\left(P_y^0 - \frac{V}{c^2}M_z^0\right), \quad P_z = \Gamma\left(P_z^0 + \frac{V}{c^2}M_y^0\right). \tag{6.100}$$


These equations clear up at once the relationship between the 
three-dimensional vectors P and M introduced earlier. Here one 
can repeat everything that was said about the relationship between 
electric and magnetic fields. As a rule, magnetization is
accompanied by polarization, and vice versa. P or M can be
equal to zero only in a specially chosen coordinate system. An
object that is polarized but not magnetized in its rest frame is
both polarized and magnetized for an observer relative to whom this
object moves. Indeed, suppose that in the frame K’ in which the
object is at rest

$$\mathbf{M}^0 = 0, \qquad \mathbf{P}^0(P_x^0, P_y^0, P_z^0) \neq 0.$$

Then in the frame K relative to which the object moves at the
velocity V,

$$P_x = P_x^0, \quad P_y = \Gamma P_y^0, \quad P_z = \Gamma P_z^0,$$

$$M_x = 0, \quad M_y = \Gamma VP_z^0, \quad M_z = -\Gamma VP_y^0.$$

Consequently in the frame K magnetization of the object will be
observed. If the object moves at a non-relativistic velocity, i.e.
V/c ≪ 1 and Γ ≈ 1, then

$$\mathbf{P} = \mathbf{P}^0 \quad \text{and} \quad \mathbf{M} = [\mathbf{P}\mathbf{V}].$$

This effect was found in experiments performed by Eichenwald
(see [13], [29]). On the contrary, if in the frame K’, relative to
which the object is at rest,

$$\mathbf{P}^0 = 0, \qquad \mathbf{M}^0(M_x^0, M_y^0, M_z^0) \neq 0,$$

then in the frame K relative to which the object moves at the
velocity V

$$M_x = M_x^0, \quad M_y = \Gamma M_y^0, \quad M_z = \Gamma M_z^0,$$

$$P_x = 0, \quad P_y = -\Gamma\frac{V}{c^2}M_z^0, \quad P_z = \Gamma\frac{V}{c^2}M_y^0. \tag{6.101}$$

Consequently, the object turns out to be polarized in the frame K.
If the object moves at a non-relativistic velocity, then

$$\mathbf{M} = \mathbf{M}^0, \qquad \mathbf{P} = \frac{1}{c^2}[\mathbf{V}\mathbf{M}].$$

This implies, for example, that a moving permanent magnet carries
an electric moment giving rise to the phenomenon of homopolar
induction utilized in electrical engineering.

Here is an example illustrating these conclusions. Let the density
of a current flowing along the rectangular loop ABCD be equal
to j⁰, and the loop itself move at the velocity V relative to the
frame K. Let us fix the frame K’ to the loop (Fig. 6.5). In
accordance with Eq. (6.24) a charge ρ > 0 appears in the section BC
and a corresponding charge ρ < 0 in the section AD. It is obvious
that the total charge appearing in the loop ABCD is equal to zero.
At the same time this loop possesses an electric moment directed
along the y axis. We shall show that an elementary calculation
coincides with the conclusions of the STR. Although there is no
dipole moment in K’ and only the z component of M is present,
$P_y = -\Gamma\frac{V}{c^2}M_z^0$ emerges in K according to Eq.
(6.101). In the frame K’ the rectangular current ABCD possesses the
magnetic moment IS, where the vector S is directed toward the
negative z axis and is equal in magnitude to ab (a and b being the
sides of the rectangular loop). Assuming for the sake of simplicity
the cross-section of the conductor to be equal to unity, we obtain
$M_z^0 = -j^0ab$. The electric dipole moment emerging in the loop is
not difficult to calculate. According to Eq. (6.24)
$\rho = \Gamma\frac{V}{c^2}j^0$, the distance between BC and AD is
equal to b, and the total charge in these sections is equal to ρa.
The direction of this dipole moment coincides with that of the y
axis. Therefore,
$P_y = \rho ab = \Gamma\frac{V}{c^2}\,abj^0 = -\Gamma\frac{V}{c^2}M_z^0$,
just as it should be.

Fig. 6.5. The current loop ABCD considered in the reference frame K
relative to which this loop moves.
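The transformation of the moments and the loop example can be reproduced with a few lines of code. The sketch below is only an illustration: the function name and every numerical value are hypothetical, and the motion is taken along the x axis as in Eq. (6.100).

```python
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def transform_moments(P0, M0, V):
    """Eq. (6.100): polarization P and magnetization M seen from the frame
    relative to which the substance moves along +x at the speed V."""
    G = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    P = (P0[0], G * (P0[1] - V * M0[2] / C ** 2), G * (P0[2] + V * M0[1] / C ** 2))
    M = (M0[0], G * (M0[1] + V * P0[2]), G * (M0[2] - V * P0[1]))
    return P, M, G


# (1) A polarized, non-magnetized body at a laboratory speed (hypothetical).
P0, V_slow = (0.0, 1.0e-6, 2.0e-6), 10.0
P, M, _ = transform_moments(P0, (0.0, 0.0, 0.0), V_slow)
print("M      :", M)
print("[P0 V] :", cross(P0, (V_slow, 0.0, 0.0)))

# (2) The current loop: M_z^0 = -j0*a*b, no polarization at rest.
j0, a, b, V_fast = 2.0, 0.1, 0.05, 0.5 * C
P_loop, _, G = transform_moments((0.0, 0.0, 0.0), (0.0, 0.0, -j0 * a * b), V_fast)
rho = G * V_fast * j0 / C ** 2          # charge density on BC, Eq. (6.24)
print("P_y from (6.100):", P_loop[1], "  rho*a*b:", rho * a * b)
```

In the first case the magnetization coincides with [PV] to the accuracy of terms V²/c²; in the second the elementary calculation ρab reproduces the dipole moment given by the transformation formulas.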



§ 6.10. Some problems involving the transformation of an
electromagnetic field. The field of a uniformly moving charge. The
magnetic and electric fields of a uniformly moving charge are most
easily obtained by transformation of the fields existing in the
frame K’ in which the charge is at rest. In the case of a point
electric charge e resting in the frame K’ we face an electrostatic
problem since such a charge produces only an electric field.
However, when the same charge is considered in terms of the
frame K moving at the velocity −V relative to K’, it is found to
generate a rectilinear current. A magnetic field induced by a
rectilinear current is very well known: lines of force of such a
field form circles whose centres lie on the current axis; the
planes of these circles are perpendicular to the current direction.
Naturally, these results follow from the field transformation
equations.

Now, let a point charge be located at the origin of the frame K’.
Then in this frame

$$\mathbf{B}' = 0, \qquad \mathbf{E}' = \frac{e}{4\pi\varepsilon_0}\,\frac{\mathbf{r}'}{r'^3},$$

or, when expressed in projections on the coordinate axes,

$$B_x' = 0, \quad B_y' = 0, \quad B_z' = 0,$$

$$E_x' = \frac{e}{4\pi\varepsilon_0}\,\frac{x'}{r'^3}, \quad E_y' = \frac{e}{4\pi\varepsilon_0}\,\frac{y'}{r'^3}, \quad E_z' = \frac{e}{4\pi\varepsilon_0}\,\frac{z'}{r'^3},$$

where $r'^2 = x'^2 + y'^2 + z'^2$. According to Eq. (6.36) we obtain
in the frame K

$$E_x = E_x', \quad E_y = \Gamma E_y', \quad E_z = \Gamma E_z', \tag{6.102}$$

$$B_x = B_x' = 0, \quad B_y = -\Gamma\frac{V}{c^2}E_z', \quad B_z = \Gamma\frac{V}{c^2}E_y'. \tag{6.103}$$


As $B_x = 0$, the magnetic field in the frame K lies in the
planes perpendicular to the x axis, i.e. in the planes perpendicular
to the current direction. The equations describing the lines of
force of the magnetic field take the following form in the frame K:

$$\frac{dy}{B_y} = \frac{dz}{B_z}, \quad \text{or} \quad \frac{dy}{dz} = \frac{B_y}{B_z}.$$

But

$$\frac{B_y}{B_z} = -\frac{E_z'}{E_y'} = -\frac{z'}{y'} = -\frac{z}{y},$$

since z’ = z and y’ = y under the Lorentz transformation.
Consequently, the differential equation for the lines of force takes
the form dy/dz = −z/y, or y dy + z dz = 0, i.e. d(y² + z²) = 0.
Hence, it is obvious that we have the equation of a circle
y² + z² = const as a first integral. Consequently, the lines of
force represent circles with centres located on the current axis.

Surely, one can transform not only fields, but potentials as well.
In the frame K’ the scalar potential is equal to

$$\varphi' = \frac{1}{4\pi\varepsilon_0}\,\frac{e}{r'},$$

whereas the vector potential is equal to zero: A’ = 0. If the
4-potential $\vec{\Phi}$ has the components
$\left(\mathbf{A}', \frac{i}{c}\varphi'\right)$ in the frame K’, its
components in the frame K take the following form in accordance
with Eq. (6.14a):

$$\Phi_1 = \Gamma\left(\Phi_1' - i\frac{V}{c}\Phi_4'\right), \quad \Phi_2 = \Phi_2', \quad \Phi_3 = \Phi_3', \quad \Phi_4 = \Gamma\left(\Phi_4' + i\frac{V}{c}\Phi_1'\right).$$


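Substituting the values of the 4-potential components in the
frame K’, we get

$$A_x = \Gamma\left(-i\frac{V}{c}\cdot\frac{i}{c}\varphi'\right) = \Gamma\frac{V}{c^2}\varphi', \quad A_y = 0, \quad A_z = 0, \quad \frac{i}{c}\varphi = \Gamma\frac{i}{c}\varphi'.$$

Thus,

$$\mathbf{A} = \Gamma\frac{\mathbf{V}}{c^2}\varphi' = \frac{\mathbf{V}}{c^2}\varphi, \qquad \varphi = \Gamma\varphi'. \tag{6.104}$$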

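Now we have to express r’ entering into φ’ via the coordinates
in the frame K. According to the Lorentz transformation

$$x' = \Gamma(x - Vt), \quad y' = y, \quad z' = z, \tag{6.105}$$

and the expression for r’² will be written as

$$r'^2 = x'^2 + y'^2 + z'^2 = \Gamma^2(x - Vt)^2 + y^2 + z^2 = \Gamma^2\left[(x - Vt)^2 + \left(1 - \frac{V^2}{c^2}\right)(y^2 + z^2)\right] = \Gamma^2R^{*2}, \tag{6.106}$$

where the following designation is introduced:

$$R^{*2} = (x - Vt)^2 + \left(1 - \frac{V^2}{c^2}\right)(y^2 + z^2). \tag{6.107}$$

Making use of Eq. (6.107), one can express the scalar potential φ,
defined in Eq. (6.104), through R*:

$$\varphi = \Gamma\varphi' = \frac{\Gamma e}{4\pi\varepsilon_0r'} = \frac{1}{4\pi\varepsilon_0}\,\frac{e}{R^*}.$$

Accordingly, the vector potential A can be written in the form

$$\mathbf{A} = \frac{\mathbf{V}}{c^2}\varphi = \frac{1}{4\pi\varepsilon_0c^2}\,\frac{e\mathbf{V}}{R^*}.$$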

Let us rewrite the expressions for the components of the field E,
taking into account Eqs. (6.102), (6.105) and (6.107). We get:

$$E_x = \frac{e(x - Vt)}{\Gamma^2\,4\pi\varepsilon_0R^{*3}}, \quad E_y = \frac{ey}{\Gamma^2\,4\pi\varepsilon_0R^{*3}}, \quad E_z = \frac{ez}{\Gamma^2\,4\pi\varepsilon_0R^{*3}}. \tag{6.108}$$

In the frame K’ the charge is located at the origin O’ (i.e. at the
point x’ = 0). Its coordinates at the moment t in the frame K will
be as follows: x₀ = Vt, y₀ = 0, z₀ = 0. Let us introduce one more
vector, R, directed from the point O’, where the charge is located,
toward the observation point A whose coordinates are (x, y, z)
(Fig. 6.6). The vector R will take the form

$$\mathbf{R} = (x - Vt)\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}, \tag{6.109}$$

where i, j, k are unit vectors along the x, y, z axes respectively.
Having multiplied the components of Eq. (6.108) by i, j, and k
respectively, we obtain

$$\mathbf{E} = E_x\mathbf{i} + E_y\mathbf{j} + E_z\mathbf{k} = \frac{e\mathbf{R}}{\Gamma^2\,4\pi\varepsilon_0R^{*3}}.$$

Fig. 6.6. To the calculation of an electric and a magnetic field of
a uniformly moving charge (the figure marks the segment
x − Vt = R cos θ).

If one introduces the angle θ between the charge motion direction
(i.e. the x axis) and the radius vector R, then

$$x - Vt = R\cos\theta, \qquad R^2 = R^2\cos^2\theta + y^2 + z^2,$$

and consequently,

$$y^2 + z^2 = R^2\sin^2\theta. \tag{6.110}$$

Taking into account Eqs. (6.109) and (6.110), one can rewrite
Eq. (6.107) as

$$R^{*2} = R^2\left(1 - \frac{V^2}{c^2}\sin^2\theta\right),$$

whereupon the expression for E can be finally represented in the
form

$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\,\frac{e\mathbf{R}}{R^3}\,\frac{1 - \dfrac{V^2}{c^2}}{\left(1 - \dfrac{V^2}{c^2}\sin^2\theta\right)^{3/2}}. \tag{6.111}$$

Eq. (6.111) presents the electric field of a moving charge in very 
convenient variables, i.e. the distance R from the moving charge 
and the angle 6 formed by the direction to the point at which the 
field is sought and the charge motion direction. Eq. (6.111) shows 
that the magnitude of the field depends on the angle 6. Al a 
fixed R the minimum magnitude of the field corresponds to the 
charge motion direction (@ = 0, x): 


Eanes (I ~ =), 


and the maximum magnitude of the field is observed in the direc- 
tion perpendicular to the motion (@ = x/2): 


I e 1 


1 ane R? VI—V 4c? 


The field strength depends on the velocity of the charge, with
E∥ decreasing and E⊥ growing as the velocity increases. The
electric field of a charge moving at a relativistic velocity is
localized within two narrow solid angles whose boundary surface is
approximately determined from the relation (V² sin² θ/c²) ≈ 1; the
axial line of these solid angles is perpendicular to the charge
motion direction.
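
The angular dependence in Eq. (6.111) can be illustrated numerically.
The following is only a minimal sketch: the function names
field_factor and anisotropy are hypothetical helpers introduced here,
and beta stands for V/c. It evaluates the factor multiplying the
Coulomb field along and across the motion direction.

```python
import math


def field_factor(beta, theta):
    """Angular factor of Eq. (6.111):
    (1 - beta^2) / (1 - beta^2 sin^2(theta))^(3/2), with beta = V/c."""
    return (1.0 - beta**2) / (1.0 - beta**2 * math.sin(theta) ** 2) ** 1.5


def anisotropy(beta):
    """Ratio of the transverse to the longitudinal field at the same R."""
    return field_factor(beta, math.pi / 2) / field_factor(beta, 0.0)


# Illustrative velocities only; they are not taken from the text.
for beta in (0.1, 0.6, 0.99):
    print(
        f"beta = {beta}: "
        f"E_par/E_Coulomb = {field_factor(beta, 0.0):.4f}, "
        f"E_perp/E_Coulomb = {field_factor(beta, math.pi / 2):.4f}, "
        f"E_perp/E_par = {anisotropy(beta):.2f}"
    )
```

Under these assumptions the ratio E⊥/E∥ equals (1 − V²/c²)^(−3/2), so
it grows without bound as V approaches c, in line with the flattening
of the field described above.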

The magnetic field B of a charge moving in the frame K can be
found by means of Eq. (6.44) (B’ = 0 in the frame K’):

$$\mathbf{B} = \frac{1}{c^2}\,[\mathbf{V}\mathbf{E}]. \tag{6.112}$$


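When the velocity of the charge is low, the fields in vacuo are
described by the following approximate relations:

$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\,\frac{e\mathbf{R}}{R^3}$$

and

$$\mathbf{B} = \frac{1}{4\pi\varepsilon_0 c^2}\,\frac{e[\mathbf{V}\mathbf{R}]}{R^3} = \frac{\mu_0}{4\pi}\,\frac{[e\mathbf{V}\,\mathbf{R}]}{R^3}. \tag{6.113}$$

Eq. (6.113) represents the Biot-Savart law.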
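
The equivalence of the two forms in Eq. (6.113) rests only on
ε₀μ₀c² = 1. A short numerical sketch makes this explicit; the helper
names (cross, coulomb_field, b_from_e, biot_savart) and the sample
values of e, V and R are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0          # speed of light, m/s
MU0 = 4e-7 * math.pi       # vacuum permeability (classical value, assumed)
EPS0 = 1.0 / (MU0 * C**2)  # vacuum permittivity from eps0 * mu0 * c^2 = 1


def cross(a, b):
    """Vector product [a b] of two 3-component lists."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def coulomb_field(e, R):
    """Low-velocity electric field E = e R / (4 pi eps0 R^3)."""
    r = math.sqrt(sum(x * x for x in R))
    k = e / (4 * math.pi * EPS0 * r**3)
    return [k * x for x in R]


def b_from_e(V, E):
    """Magnetic field from Eq. (6.112): B = [V E] / c^2."""
    return [x / C**2 for x in cross(V, E)]


def biot_savart(e, V, R):
    """Biot-Savart form of Eq. (6.113): B = mu0 [eV, R] / (4 pi R^3)."""
    r = math.sqrt(sum(x * x for x in R))
    k = MU0 * e / (4 * math.pi * r**3)
    return [k * x for x in cross(V, R)]


# Illustrative values: an elementary charge moving slowly along x.
e = 1.602e-19
V = [1.0e5, 0.0, 0.0]
R = [0.0, 1.0e-3, 2.0e-3]

B_a = b_from_e(V, coulomb_field(e, R))
B_b = biot_savart(e, V, R)
print("B from Eq. (6.112):", B_a)
print("B from Biot-Savart:", B_b)
```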

The interaction of two moving charges. Let two charges e₁ and
e₂ move in parallel at the same velocity V. Let us determine the
interaction force between them in the reference frame K relative
to which they move. First, we shall find the force acting on the
charge e₁.

The charge e₁ experiences the action of an electric and a magnetic
field induced by the charge e₂. The force acting on the
charge e₁ is the Lorentz force:

$$\mathbf{F}_1 = e_1\{\mathbf{E}_2 + [\mathbf{V}\mathbf{B}_2]\}.$$

Taking into account Eq. (6.112), one can write down

$$\mathbf{F}_1 = e_1\mathbf{E}_2 + \frac{e_1}{c^2}[\mathbf{V}[\mathbf{V}\mathbf{E}_2]] = e_1\mathbf{E}_2 + \frac{e_1}{c^2}\mathbf{V}(\mathbf{V}\mathbf{E}_2) - \frac{e_1}{c^2}V^2\mathbf{E}_2 = e_1\left(1 - \frac{V^2}{c^2}\right)\mathbf{E}_2 + \frac{e_1}{c^2}\mathbf{V}(\mathbf{V}\mathbf{E}_2). \tag{6.114}$$


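E₂ can be found from Eq. (6.111) in which R is assumed to be
the radius vector drawn from the charge e₂ to e₁ and θ the angle
between R and the charge motion velocity direction V. Substituting
Eq. (6.111) into Eq. (6.114), we obtain

$$\mathbf{F}_1 = \frac{e_1e_2\left(1 - \frac{V^2}{c^2}\right)^2\mathbf{R}}{4\pi\varepsilon R^3\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}} + \frac{e_1e_2\mathbf{V}V\cos\theta\left(1 - \frac{V^2}{c^2}\right)}{4\pi\varepsilon c^2R^2\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}} = \frac{e_1e_2\left(1 - \frac{V^2}{c^2}\right)}{4\pi\varepsilon R^2\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}}\left\{\left(1 - \frac{V^2}{c^2}\right)\frac{\mathbf{R}}{R} + \frac{\mathbf{V}V}{c^2}\cos\theta\right\},$$

whence the component along the motion direction is

$$F_x = \frac{e_1e_2}{4\pi\varepsilon R^2}\,\frac{\left(1 - \frac{V^2}{c^2}\right)\cos\theta}{\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}}$$

and the component perpendicular to the motion direction

$$F_y = \frac{e_1e_2}{4\pi\varepsilon R^2}\,\frac{\left(1 - \frac{V^2}{c^2}\right)^2\sin\theta}{\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}}. \tag{6.115}$$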


Let the charges be located on a straight line parallel to the y
axis, with one of the charges being at the x axis, so that the
distance between the charges is equal to y. Then θ = π/2, F_x = 0,
and

$$F_y = \frac{e_1e_2}{4\pi\varepsilon y^2}\sqrt{1 - \frac{V^2}{c^2}}. \tag{6.116}$$


This equation can be obtained very simply. In the frame K’
where the two charges are at rest, the interaction between them is
of electrostatic nature and the interaction force is equal to
e₁e₂/4πεy². The transformation of this force on transition from the
frame K’ to the frame K by means of Eq. (5.35) yields Eq. (6.116).
According to Eq. (6.116) the charges repel each other in the frame
K. But in the frame K the charges move producing two parallel
currents flowing in the same direction. When such currents flow
along conductors, they attract each other. There is no contradiction
here since the physical situations are different. Let us consider
the expression for a force (Eq. (6.116)) in vacuo in the case of
non-relativistic velocities V:

$$F_y \approx \frac{e_1e_2}{4\pi\varepsilon_0 y^2}\left(1 - \frac{V^2}{2c^2}\right) = \frac{e_1e_2}{4\pi\varepsilon_0 y^2} - \frac{e_1e_2}{4\pi\varepsilon_0 y^2}\,\frac{V^2}{2c^2}. \tag{6.117}$$

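On the other hand, following Ampere, the interaction force
between two current elements e₁V and e₂V in vacuo can be written
as

$$\mathbf{F}_1 = \frac{\mu_0}{4\pi}\,\frac{[e_1\mathbf{V}\,[e_2\mathbf{V},\,\mathbf{R}]]}{R^3} = \frac{\mu_0e_1e_2}{4\pi}\,\frac{[\mathbf{V}[\mathbf{V}\mathbf{R}]]}{R^3} = -\frac{\mu_0e_1e_2}{4\pi}\,V^2\,\frac{\mathbf{R}}{R^3}.$$

Taking into account that R = yj, we finally get

$$\mathbf{F}_1 = -\frac{\mu_0e_1e_2}{4\pi}\,\frac{V^2}{y^2}\,\mathbf{j}. \tag{6.118}$$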


The force defined by Eq. (6.117) and observed in the reference
frame K relative to which the charges move consists of the Coulomb
repulsion and the Ampere attraction (the latter to within a factor
of 1/2). The force expressed in the form of Eq. (6.117) can be
used for the explanation of current interactions in conductors only
with certain stipulations. Neutral current-carrying conductors must
attract each other in these conditions. However, a current-carrying
conductor is neutral only in one reference frame (§ 6.1). That is
why the Coulomb repulsion ought to be taken into consideration.
For all that, it usually seems to be weaker than the attraction.
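
The relation between Eqs. (6.116), (6.117) and (6.118) can be checked
numerically. In the sketch below all forces are expressed in units of
the Coulomb force e₁e₂/4πε₀y², and the function names are hypothetical
helpers rather than anything defined in the text; beta denotes V/c.

```python
import math


def exact_factor(beta):
    """F_y of Eq. (6.116) divided by the Coulomb force."""
    return math.sqrt(1.0 - beta**2)


def approx_factor(beta):
    """Non-relativistic expansion of Eq. (6.117), same units."""
    return 1.0 - beta**2 / 2.0


def ampere_factor(beta):
    """Magnitude of the Ampere force (6.118) divided by the Coulomb force.

    Uses mu0 = 1 / (eps0 c^2), so the ratio is simply beta^2.
    """
    return beta**2


# Illustrative velocities only.
for beta in (1e-3, 1e-2, 0.1):
    deficit = 1.0 - exact_factor(beta)  # reduction of the repulsion
    print(
        f"beta = {beta}: exact = {exact_factor(beta):.10f}, "
        f"approx = {approx_factor(beta):.10f}, "
        f"deficit / Ampere term = {deficit / ampere_factor(beta):.4f}"
    )
```

For small beta the last column tends to 1/2, which is the factor
mentioned above: the velocity-dependent reduction of the repulsion is
half of the Ampere attraction.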

§ 6.11. An energy-momentum-tension tensor of an electromagnetic
field in vacuo. A transition to four-dimensional quantities
combines the quantities whose interrelationship was imperceptible
in a three-dimensional approach. In the case of a free particle one
4-vector combines energy and momentum. An electric and a magnetic
field constitute an electromagnetic field tensor in 4-space. An
energy and a momentum of an electromagnetic field turn out to be
components of a tensor which, apart from an energy (a scalar in
a three-dimensional case) and a momentum (a three-dimensional
vector), comprises also the three-dimensional tension tensor of
Maxwell. Here we shall have to quote the results of the Maxwell
theory in a three-dimensional form.

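1. The energy conservation law for charges and a field. This law
follows directly from Maxwell’s equations: multiplying Eq. (6.56a)
scalarwise by E and Eq. (6.57a) by H and subtracting the
expressions thus obtained, we get

$$\mathbf{H}\,\mathrm{rot}\,\mathbf{E} - \mathbf{E}\,\mathrm{rot}\,\mathbf{H} = -\mathbf{j}\mathbf{E} - \dot{\mathbf{D}}\mathbf{E} - \dot{\mathbf{B}}\mathbf{H}.$$

Making use of the following identities H rot E − E rot H = div [EH]
and (∂/∂t)(ED + BH) = 2(ḊE + ḂH) (the last one is valid for
an isotropic medium in which D = εE, B = μH), we get

$$\frac{\partial}{\partial t}\left(\frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2}\right) = -\mathbf{j}\mathbf{E} - \mathrm{div}\,[\mathbf{E}\mathbf{H}],$$

whence, after integration over an arbitrary volume 𝒱 and using
the Gauss-Ostrogradsky theorem, we come to

$$\frac{\partial}{\partial t}\int_{\mathcal{V}}\frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2}\,d\mathcal{V} = -\int_{\mathcal{V}}\mathbf{j}\mathbf{E}\,d\mathcal{V} - \oint_S\mathbf{S}\,d\mathbf{S}. \tag{6.119}$$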


The left-hand side of Eq. (6.119) represents a time variation of
an electromagnetic field energy in a volume 𝒱. This energy is
defined in the Maxwell theory via an energy density (an energy
per unit of volume):

$$w = \frac{\mathbf{E}\mathbf{D}}{2} + \frac{\mathbf{B}\mathbf{H}}{2} \tag{6.120}$$

by integrating over a volume:

$$W = \int_{\mathcal{V}} w\,d\mathcal{V}. \tag{6.121}$$

Let us consider the simplest case of charges in vacuo. In this
case j = ρv, while the force density acting on these charges, i.e.
a Lorentz force density, is

$$\mathbf{f} = \rho\,(\mathbf{E} + [\mathbf{v}\mathbf{B}]) = \rho\mathbf{E} + [\mathbf{j}\mathbf{B}]. \tag{6.122}$$


This force is introduced into the theory in order to correlate the
field theory with the field of force acting on charged objects
located in the field. Eq. (6.119) features an expression jE.
Multiplying Eq. (6.122) scalarwise by v, we obtain fv = ρEv = jE
(the magnetic term drops out because [vB] is perpendicular to v).
Therefore, one of the terms of the right-hand side represents a work
performed by a field on a charge in this case. In accordance with
the energy conservation law this work must turn into a kinetic
energy of particles T. Consequently,

$$\int_{\mathcal{V}}\mathbf{j}\mathbf{E}\,d\mathcal{V} = \int_{\mathcal{V}}\mathbf{f}\mathbf{v}\,d\mathcal{V} = \frac{dT}{dt}. \tag{6.123}$$

The second term of the right-hand side of Eq. (6.119) contains
the Poynting vector

$$\mathbf{S} = [\mathbf{E}\mathbf{H}], \tag{6.124}$$

while the integral itself represents a flux of the vector S through
the surface enclosing the volume 𝒱. The integrand also includes
dS = n dS, a product of a surface area dS and a unit vector n
of its normal. Hence, the energy conservation law for charges and
a field can be written down as follows:

$$\frac{d}{dt}\,(T + W) = -\oint_S\mathbf{S}\,d\mathbf{S}. \tag{6.125}$$


The Poynting vector (6.124) is usually interpreted as an energy
flux per unit time through a unit area oriented normally to the
Poynting vector. Such an interpretation does not necessarily follow
from the Maxwell equations. The direct consequence of the Maxwell
equations is the integral relation (6.119) which can be regarded
as the energy conservation law. It is clear that adding to the
Poynting vector S any vector S’ satisfying the condition div S’ = 0
does not change the relation (6.119). The generally accepted
interpretation, however, is confirmed by experiment.

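2. The momentum conservation law for charges and a field. The
momentum conservation law can be treated as follows. Multiplying
Eq. (6.56a) vectorwise by B and Eq. (6.57a) by D, and adding
the equations obtained termwise, we get

$$\mu[\mathbf{H}\,\mathrm{rot}\,\mathbf{H}] + \varepsilon[\mathbf{E}\,\mathrm{rot}\,\mathbf{E}] = -[\mathbf{j}\mathbf{B}] + \varepsilon\mu[\mathbf{H}\dot{\mathbf{E}}] - \varepsilon\mu[\mathbf{E}\dot{\mathbf{H}}].$$

We have taken into account that D = εE and B = μH; finally:

$$\mu[\mathbf{H}\,\mathrm{rot}\,\mathbf{H}] + \varepsilon[\mathbf{E}\,\mathrm{rot}\,\mathbf{E}] = -[\mathbf{j}\mathbf{B}] - \varepsilon\mu\,\frac{\partial}{\partial t}[\mathbf{E}\mathbf{H}]. \tag{6.126}$$

Let us make use of the vector identity

$$\mathbf{a}\,\mathrm{div}\,\mathbf{a} - [\mathbf{a}\,\mathrm{rot}\,\mathbf{a}] = \frac{\partial}{\partial x_\beta}\left(a_\alpha a_\beta - \frac{1}{2}\,\delta_{\alpha\beta}\,a^2\right)\mathbf{m}_\alpha.$$

Subtracting the left-hand and right-hand sides of Eq. (6.126)
respectively from the identity

$$\mu\mathbf{H}\,\mathrm{div}\,\mathbf{H} + \varepsilon\mathbf{E}\,\mathrm{div}\,\mathbf{E} = \rho\mathbf{E}$$

(see Eqs. (6.56b) and (6.57b)), we get the final equation

$$\frac{\partial}{\partial x_\beta}\left(\varepsilon E_\alpha E_\beta + \mu H_\alpha H_\beta - \delta_{\alpha\beta}\,\frac{\varepsilon E^2 + \mu H^2}{2}\right)\mathbf{m}_\alpha = \rho\mathbf{E} + [\mathbf{j}\mathbf{B}] + \varepsilon\mu\,\frac{\partial}{\partial t}[\mathbf{E}\mathbf{H}],$$

which can be rewritten in the form

$$\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha = \mathbf{f} + \frac{\partial}{\partial t}[\mathbf{D}\mathbf{B}], \tag{6.127}$$

where the tension tensor of Maxwell is introduced

$$T_{\alpha\beta} = \varepsilon E_\alpha E_\beta + \mu H_\alpha H_\beta - \delta_{\alpha\beta}\,w = E_\alpha D_\beta + H_\alpha B_\beta - \delta_{\alpha\beta}\,w. \tag{6.128}$$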


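The tensor (6.128), being symmetric in vacuo and isotropic media,
is asymmetric in anisotropic media where it is defined according
to the last equation of (6.128). Integrating Eq. (6.127) over
an arbitrary volume in the region where an electromagnetic field
exists, we get

$$\int_{\mathcal{V}}\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha\,d\mathcal{V} = \int_{\mathcal{V}}\mathbf{f}\,d\mathcal{V} + \int_{\mathcal{V}}\frac{\partial}{\partial t}[\mathbf{D}\mathbf{B}]\,d\mathcal{V}. \tag{6.129}$$

It was again assumed that in Eq. (6.127) we deal with free
charges in vacuo, subjected to the Lorentz force (6.122).
According to the second law of Newton

$$\int_{\mathcal{V}}\mathbf{f}\,d\mathcal{V} = \frac{d\mathbf{P}}{dt}, \tag{6.130}$$

where P is the momentum of charges enclosed within the volume 𝒱.
The integral with respect to volume entering into the left-hand
side of Eq. (6.129) transforms into the integral with respect to the
surface enveloping the volume 𝒱:

$$\int_{\mathcal{V}}\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha\,d\mathcal{V} = \oint_S T_{\alpha\beta}\,n_\beta\,\mathbf{m}_\alpha\,dS. \tag{6.131}$$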


The expressions

$$T_{\alpha\beta}\,n_\beta\,\mathbf{m}_\alpha\,dS$$

represent a force acting on an infinitesimal surface area dS whose
normal’s components are n_β. The vectors m_α are unit vectors
of the Cartesian coordinate system. We could already write down
the momentum conservation law if we knew what should be regarded
as a momentum of an electromagnetic field in matter. So
let us confine ourselves to the case of vacuum where [DB] =
(1/c²) [EH] = S/c². Then taking into account Eqs. (6.130) and
(6.131), Eq. (6.129) can be rewritten in the form

$$\frac{d}{dt}\,(\mathbf{P} + \mathbf{G}) = \oint_S T_{\alpha\beta}\,n_\beta\,\mathbf{m}_\alpha\,dS, \tag{6.132}$$

having defined a field momentum density g in vacuo as

$$\mathbf{g} = \mathbf{S}/c^2 \tag{6.133}$$

and consequently a field momentum in the volume 𝒱 as

$$\mathbf{G} = \int_{\mathcal{V}}\mathbf{g}\,d\mathcal{V}.$$
Equation (6.132) and the definition (6.133) express the
momentum conservation law. For a complete field, when T_αβ = 0 on
the boundary surface, we obtain the conservation law
(d/dt) (P + G) = 0. The tensor T_αβ is not defined unambiguously:
the Maxwell equations yield only the integral equation (6.132), and
if one adds a component of an arbitrary tensor T’_αβ, satisfying the
condition (∂T’_αβ/∂x_β) = 0, to each component of the tensor T_αβ,
then

$$\oint_S T'_{\alpha\beta}\,n_\beta\,\mathbf{m}_\alpha\,dS = \int_{\mathcal{V}}\frac{\partial T'_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha\,d\mathcal{V} = 0,$$

and Eq. (6.132) retains its validity all the same. Here we proceed
in much the same way as we did when selecting an expression for
the Poynting vector from the energy conservation theorem or finding
an expression for a displacement current density. Our selection
depends on the correctness of all of its consequences. The tension
tensor (6.128) in vacuo where D = ε₀E, B = μ₀H, together with
the definition of a momentum (Eq. (6.133)), yields reasonable
physical results.

In conclusion note that Eq. (6.132) makes it clear that the
definitions of a momentum density and a tension tensor are closely
interrelated. Having redefined a momentum density, we at once
modify the expression for T_αβ (see § 6.12).

Let us summarize the results that we obtained for the case of
vacuum: as a consequence of the Maxwell equations, the momentum
density defined by Eq. (6.133) ought to be assigned to an
electromagnetic field in vacuo. Then Eq. (6.132) expresses the
Newton law: an increment of the total momentum of charges and of a
field in a volume 𝒱 is equal to the sum of forces acting upon this
volume. These forces can be written down in the form of surface
forces, i.e. the forces acting on a surface enveloping the volume 𝒱.

A transition to four-dimensional terms can be accomplished as
follows. First, let us prove that a 4-force density f (see Eq. (6.54))
can be rewritten as a four-dimensional divergence of a tensor T_ik:

$$f_i = \frac{1}{c}F_{ik}s_k = \frac{\partial T_{ik}}{\partial x_k},$$

where T_ik is an energy-momentum-tension tensor *; the components
of this tensor have the following form:

$$T_{ik} = \frac{1}{c}F_{im}f_{mk} + \frac{1}{4c}\,\delta_{ik}\,(f_{sn}F_{sn}). \tag{6.134}$$

In the first term of the right-hand side of Eq. (6.134) summation
is performed over m, and in the second one over s and n; F_im
and f_sn are the corresponding components of tensors (6.29a) and
(6.31).

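In order to obtain Eq. (6.134) we shall need the Maxwell equations
expressed as Eqs. (6.60) and (6.67); here we shall rewrite
them in a more convenient form:

$$\frac{\partial f_{ik}}{\partial x_k} = s_i, \tag{6.135}$$

$$\frac{\partial F_{ik}}{\partial x_l} + \frac{\partial F_{li}}{\partial x_k} = -\frac{\partial F_{kl}}{\partial x_i}. \tag{6.136}$$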

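Let us transform the four-dimensional force density:

$$f_i = \frac{1}{c}F_{ik}s_k = \frac{1}{c}F_{ik}\frac{\partial f_{kl}}{\partial x_l} = \frac{1}{c}\left\{\frac{\partial}{\partial x_l}(F_{ik}f_{kl}) - f_{kl}\frac{\partial F_{ik}}{\partial x_l}\right\}. \tag{6.137}$$

Here we made use of Eq. (6.135) and applied the rule of a product
differentiation.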

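Now we shall deal with the second term in the last link of Eq.
(6.137):

$$f_{kl}\frac{\partial F_{ik}}{\partial x_l} = \frac{1}{2}\left(f_{kl}\frac{\partial F_{ik}}{\partial x_l} + f_{lk}\frac{\partial F_{ki}}{\partial x_l}\right) = \frac{1}{2}\left(f_{kl}\frac{\partial F_{ik}}{\partial x_l} + f_{kl}\frac{\partial F_{li}}{\partial x_k}\right) = \frac{1}{2}f_{kl}\left(\frac{\partial F_{ik}}{\partial x_l} + \frac{\partial F_{li}}{\partial x_k}\right) = -\frac{1}{2}f_{kl}\frac{\partial F_{kl}}{\partial x_i} = -\frac{\mu_0c}{2}f_{kl}\frac{\partial f_{kl}}{\partial x_i} = -\frac{1}{4}\frac{\partial}{\partial x_i}(f_{kl}F_{kl}). \tag{6.138}$$

The following operations are carried out in the equation chain
(6.138). The transition to the second link is based on antisymmetry
of the tensors f_kl and F_ik permitting of exchanging indices
in each term of the product without changing the product in the
process:

$$f_{kl}\frac{\partial F_{ik}}{\partial x_l} = f_{lk}\frac{\partial F_{ki}}{\partial x_l}. \tag{6.139}$$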


* We shall denote the four-dimensional tensor (6.134) by the same letter T
that we used in the case of the three-dimensional tensor (6.128); this should
not lead to any misunderstanding because, as it will turn out, nine components
of (6.134) for i and k changing from 1 to 3 coincide with (6.128).


Consequently, instead of one term we take a half-sum of two equal
expressions given by the left-hand and right-hand sides of Eq.
(6.139). The third link involves a substitution of mute indices
which does not change a summation result: the index l is replaced
by k and vice versa, i.e. f_lk (∂F_ki/∂x_l) is replaced by
f_kl (∂F_li/∂x_k). The common factor f_kl is taken out of the
brackets in the fourth link, while in the fifth link Eq. (6.136)
is used. In the sixth link of the equation we replaced F_kl in
accordance with Eq. (6.58); this operation is valid only for vacuum.

Eq. (6.138), however, remains valid also for a uniform isotropic
medium. It can be easily shown that the components of tensors
f and 𝔉 are proportional as before in such a medium although
spatial and temporal components have different proportionality
factors. From the general appearance of the tensors 𝔉 = (cB, −iE)
and f = (H, −icD) it is clear that the spatial components tie
together the vectors B and H, while the temporal ones the vectors D
and E. In order to get the necessary relations D = εE and B = μH,
one has to assume

$$f_{\alpha\beta} = aF_{\alpha\beta}, \qquad a = \frac{1}{\mu c} = \frac{\sqrt{\varepsilon_0\mu_0}}{\mu}, \tag{6.140}$$

$$f_{\alpha 4} = bF_{\alpha 4}, \qquad b = \varepsilon c = \frac{\varepsilon}{\sqrt{\varepsilon_0\mu_0}}. \tag{6.141}$$
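But then, starting from the fifth link of Eq. (6.138), the subsequent
chain of equations will be rewritten as follows:

$$-\frac{1}{2}f_{kl}\frac{\partial F_{kl}}{\partial x_i} = -\frac{a^{-1}}{2}f_{\alpha\beta}\frac{\partial f_{\alpha\beta}}{\partial x_i} - b^{-1}f_{\alpha 4}\frac{\partial f_{\alpha 4}}{\partial x_i} = -\frac{a^{-1}}{4}\frac{\partial}{\partial x_i}(f_{\alpha\beta})^2 - \frac{b^{-1}}{2}\frac{\partial}{\partial x_i}(f_{\alpha 4})^2 = -\frac{1}{4}\frac{\partial}{\partial x_i}(f_{\alpha\beta}F_{\alpha\beta}) - \frac{1}{2}\frac{\partial}{\partial x_i}(f_{\alpha 4}F_{\alpha 4}) = -\frac{1}{4}\frac{\partial}{\partial x_i}(f_{kl}F_{kl}) = -\frac{1}{4}\frac{\partial}{\partial x_i}(f_{sn}F_{sn}). \tag{6.142}$$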


The last link of Eq. (6.138) and the third link of Eq. (6.142) are
written according to the rule of product differentiation:

$$f_{kl}\frac{\partial f_{kl}}{\partial x_i} = \frac{1}{2}\frac{\partial}{\partial x_i}(f_{kl})^2.$$

Since the factors a and b in Eqs. (6.140) and (6.141) are constant,
they can be brought under the differential symbol. And
finally, for the sake of convenience some other mute indices are
introduced in summation; in so doing, we are not changing the
sum f_kl F_kl = f_sn F_sn. Therefore, the results obtained for vacuum
and a uniform isotropic medium prove to be the same.
Of course, this result is formally obvious because in the SI
system vacuum is just one of uniform and isotropic media as long
as only the relations D = εE and B = μH are essential.

The first term of Eq. (6.137) can be combined with the second
term in the final form given by Eq. (6.138) or Eq. (6.142), provided
both terms are differentiated with respect to the same variables.
However, a differentiation with respect to another variable can be
performed by means of the Kronecker delta

$$\frac{\partial}{\partial x_i} = \delta_{il}\frac{\partial}{\partial x_l}.$$

Now we can write down the expression for f_i in a complete form
(see Eq. (6.137)):

$$f_i = \frac{1}{c}\left\{\frac{\partial}{\partial x_l}(F_{ik}f_{kl}) + \frac{1}{4}\frac{\partial}{\partial x_i}(f_{sn}F_{sn})\right\} = \frac{1}{c}\frac{\partial}{\partial x_l}\left\{F_{ik}f_{kl} + \frac{1}{4}\,\delta_{il}\,(f_{sn}F_{sn})\right\}. \tag{6.143}$$


Let us substitute the mute summation indices in the first summand:
the index m will replace k, and the index k will replace l in the
first and second summands. Then we finally obtain

$$f_i = \frac{\partial}{\partial x_k}\left\{\frac{1}{c}F_{im}f_{mk} + \frac{1}{4c}\,\delta_{ik}\,(f_{sn}F_{sn})\right\} = \frac{\partial T_{ik}}{\partial x_k}, \tag{6.144}$$

where T_ik is defined according to Eq. (6.134).


Thus, the components of a 4-force f can be expressed through
the components of a tensor T_ik depending on the field vectors E
and B. Recall that the components of tensors 𝔉 and f are
proportional in vacuo according to Eq. (6.58), the proportionality
coefficient being the same for all components.

Owing to this circumstance and the definition of the tensor T_ik
(Eq. (6.134)), one can see that it is symmetric in vacuo, i.e.
T_ik = T_ki. It implies that this tensor has ten independent
components. In the case of matter the 4-tensor loses its symmetry.

Let us find now the T_ik components expressed through
electromagnetic field vectors. Consider first the expression
f_sn F_sn. This is precisely the sum of pairwise products of the
respective components of matrices (6.29a) and (6.31). The requisite
components are immediately seen from the definition of tensors
f(H, −icD) and 𝔉(cB, −iE). Designating the coefficient of δ_ik
as Λ, we find

$$\Lambda = \frac{1}{4c}\,f_{sn}F_{sn} = \frac{1}{4c}\cdot 2\,(c\mathbf{B}\mathbf{H} - c\mathbf{D}\mathbf{E}) = \frac{\mathbf{B}\mathbf{H}}{2} - \frac{\mathbf{D}\mathbf{E}}{2}. \tag{6.145}$$


The digit 2 appears in front of the parenthesis because due to
antisymmetric properties of f and 𝔉 the product of pairwise
components yields the same expression c(BH − DE) twice. Now the
equation of T_ik components can be rewritten as follows (−f_km are
substituted for f_mk):

$$T_{ik} = -\frac{1}{c}F_{im}f_{km} + \delta_{ik}\Lambda. \tag{6.146}$$


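Now let us consider the individual components. For example,
we shall find T_11:

$$T_{11} = -\frac{1}{c}F_{1m}f_{1m} + \Lambda = -\frac{1}{c}F_{11}f_{11} - \frac{1}{c}F_{12}f_{12} - \frac{1}{c}F_{13}f_{13} - \frac{1}{c}F_{14}f_{14} + \Lambda = -\frac{1}{c}\,(cB_zH_z + cB_yH_y - cE_xD_x) + \frac{\mathbf{B}\mathbf{H} - \mathbf{D}\mathbf{E}}{2} = -\mathbf{B}\mathbf{H} + H_xB_x + E_xD_x + \frac{\mathbf{B}\mathbf{H} - \mathbf{D}\mathbf{E}}{2} = H_xB_x + E_xD_x - \frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2} = H_xB_x + E_xD_x - w. \tag{6.147}$$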


We have found T_11 to be a component of a three-dimensional
tension tensor of Maxwell (6.128). Similarly, it can be shown that
all components of the tensor T_αβ, that is the components whose
indices i, k take on the values from 1 to 3, coincide with a tension
tensor of Maxwell (6.128). It remains to consider the T_ik
components in which at least one of the indices is equal to 4. We
shall start from T_44:

$$T_{44} = -\frac{1}{c}F_{4m}f_{4m} + \Lambda = E_xD_x + E_yD_y + E_zD_z + \frac{\mathbf{B}\mathbf{H}}{2} - \frac{\mathbf{D}\mathbf{E}}{2} = \frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2} = w. \tag{6.148}$$


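The component T_44 turned out to be equal to the electromagnetic
field energy density. Now let us find T_14:

$$T_{14} = T_{41} = -\frac{1}{c}F_{1m}f_{4m} = -\frac{1}{c}\,(F_{12}f_{42} + F_{13}f_{43}) = -ic\,(D_yB_z - D_zB_y) = -ic\,\varepsilon_0\mu_0\,[\mathbf{E}\mathbf{H}]_x = -\frac{i}{c}S_x = -ic\,\frac{S_x}{c^2} = -icg_x. \tag{6.149}$$

In much the same way

$$T_{24} = T_{42} = -icg_y, \qquad T_{34} = T_{43} = -icg_z. \tag{6.150}$$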
The components T_14, T_24, T_34 turned out to be proportional to the
components of the electromagnetic field momentum density g =
S/c². Later on it will be clear (see Eq. (6.153)) that we deal
here with a momentum density indeed, and not an energy flux to
which a momentum is proportional. Write out the matrix of the
energy-momentum-tension tensor for a case of an electromagnetic
field in vacuo:

$$T_{ik} = \begin{pmatrix} T_{11} & T_{12} & T_{13} & -icg_x \\ T_{21} & T_{22} & T_{23} & -icg_y \\ T_{31} & T_{32} & T_{33} & -icg_z \\ -\dfrac{i}{c}S_x & -\dfrac{i}{c}S_y & -\dfrac{i}{c}S_z & w \end{pmatrix} = \begin{pmatrix} T_{\alpha\beta} & -ic\mathbf{g} \\ -\dfrac{i}{c}\mathbf{S} & w \end{pmatrix}. \tag{6.151}$$

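The upper left square comprising nine quantities defines a tension
tensor of Maxwell. It becomes a correct quantity in a relativistic
case when bordered with energy quantities S and w. Let us make
sure that having composed the tensor T_ik, we obtained the energy
and momentum conservation laws expressed in a three-dimensional
form by Eqs. (6.125) and (6.132). Consider now the spatial
components of a 4-force:

$$f_\alpha = \frac{\partial T_{\alpha k}}{\partial x_k} = \frac{\partial T_{\alpha\beta}}{\partial x_\beta} + \frac{\partial T_{\alpha 4}}{\partial x_4} = \frac{\partial T_{\alpha\beta}}{\partial x_\beta} - \frac{\partial g_\alpha}{\partial t}. \tag{6.152}$$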


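We took account of a three-dimensional momentum density of an
electromagnetic field in vacuo having the components g_α = S_α/c².
Multiplying each component f_α (α = 1, 2, 3) by its respective unit
vector m_α (α = 1, 2, 3) and summing up the values thus obtained,
we get (f = f_α m_α is a three-dimensional Lorentz force density)

$$\mathbf{f} + \frac{\partial\mathbf{g}}{\partial t} = \frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha. \tag{6.153}$$

Integrating the identity (6.153) over an arbitrary volume, we
get

$$\int_{\mathcal{V}}\mathbf{f}\,d\mathcal{V} + \int_{\mathcal{V}}\frac{\partial\mathbf{g}}{\partial t}\,d\mathcal{V} = \int_{\mathcal{V}}\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha\,d\mathcal{V}. \tag{6.154}$$

The left-hand side of Eq. (6.154) features a variation of a total
momentum of particles as well as that of a total momentum of a
field:

$$\int_{\mathcal{V}}\mathbf{f}\,d\mathcal{V} = \frac{d\mathbf{P}}{dt}, \qquad \int_{\mathcal{V}}\frac{\partial\mathbf{g}}{\partial t}\,d\mathcal{V} = \frac{\partial}{\partial t}\int_{\mathcal{V}}\mathbf{g}\,d\mathcal{V} = \frac{d\mathbf{G}}{dt}. \tag{6.155}$$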


Applying the Gauss-Ostrogradsky theorem to the right-hand side
of Eq. (6.154), we get

$$\int_{\mathcal{V}}\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha\,d\mathcal{V} = \oint_S T_{\alpha\beta}\,n_\beta\,\mathbf{m}_\alpha\,dS = \oint_S T_{\alpha\beta}\,n_\alpha\,\mathbf{m}_\beta\,dS; \tag{6.156}$$

the last transition makes use of the symmetry of the tensor T_αβ.
Now then, we have arrived at the momentum conservation law
(Eq. (6.132)) and have made sure that the components T_14, T_24,
T_34 are indeed proportional to electromagnetic field momentum
components. The expression T_αβ n_α m_β can be considered not only
as a force acting on a surface element, but also as a momentum
flux through that surface element. The quantity T_αβ n_α yields a
vector component of this flux. Surely, both these interpretations
are equivalent.
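Consider now f_4. On the one hand, according to Eq. (6.54)

$$f_4 = \frac{i}{c}\,(\mathbf{j}\mathbf{E}) = \frac{i}{c}\,(\mathbf{f}\mathbf{v}), \tag{6.157}$$

and on the other hand

$$f_4 = \frac{\partial T_{4k}}{\partial x_k} = \frac{\partial T_{41}}{\partial x_1} + \frac{\partial T_{42}}{\partial x_2} + \frac{\partial T_{43}}{\partial x_3} + \frac{\partial T_{44}}{\partial x_4} = -\frac{i}{c}\,\mathrm{div}\,\mathbf{S} + \frac{\partial w}{\partial (ict)}. \tag{6.158}$$

Consequently, Eq. (6.158) can be rewritten as

$$\frac{\partial w}{\partial t} + \mathrm{div}\,\mathbf{S} + (\mathbf{v}\mathbf{f}) = 0. \tag{6.159}$$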


Integrating Eq. (6.159) with respect to an arbitrary volume of a
field and taking account of Eqs. (6.121) and (6.123), we get

$$\frac{d}{dt}\,(T + W) = -\oint_S\mathbf{S}\,d\mathbf{S}, \tag{6.160}$$

with the Gauss theorem being applied to the term div S of Eq.
(6.159). This is precisely the energy conservation law (Eq. (6.125)).

Thus, in the relativistic theory the Maxwellian tensions, momentum
and energy of a field in vacuo amalgamate into a single tensor
quantity, the energy-momentum-tension tensor. The energy and
momentum conservation laws manifest themselves via a single
relation.

The symmetry of the energy-momentum-tension tensor constitutes
a fundamentally important property. Owing to this, the fundamental
relationship between the energy flux density and the momentum
density follows immediately for the case of an electromagnetic
field in vacuo:

$$\mathbf{S} = \mathbf{g}c^2. \tag{6.161}$$

One can readily make sure that the spur of the tensor T_ik, i.e.
the sum of its diagonal components, is equal to zero.
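
The properties just listed can be checked numerically for arbitrary
field values. The sketch below is an illustration under stated
assumptions, not the book's procedure: it takes the component layout
F₁₂ = cB_z, F₁₃ = −cB_y, F₂₃ = cB_x, F_α4 = −iE_α implied by
𝔉 = (cB, −iE) (and likewise for f = (H, −icD)), builds T_ik from
Eq. (6.134), and compares it with w and S. The helper names and the
sample fields are hypothetical.

```python
import math

C = 299_792_458.0          # speed of light, m/s
MU0 = 4e-7 * math.pi       # vacuum permeability (classical value, assumed)
EPS0 = 1.0 / (MU0 * C**2)  # vacuum permittivity from eps0 * mu0 * c^2 = 1


def antisymmetric(space, time):
    """4x4 antisymmetric tensor A with A12 = space_z, A13 = -space_y,
    A23 = space_x and A_a4 = time_a (assumed layout, x4 = ict)."""
    sx, sy, sz = space
    tx, ty, tz = time
    return [
        [0, sz, -sy, tx],
        [-sz, 0, sx, ty],
        [sy, -sx, 0, tz],
        [-tx, -ty, -tz, 0],
    ]


def energy_momentum_tensor(E, B):
    """T_ik of Eq. (6.134) for fields E (V/m) and B (T) in vacuo."""
    D = [EPS0 * e for e in E]
    H = [b / MU0 for b in B]
    F = antisymmetric([C * b for b in B], [-1j * e for e in E])  # (cB, -iE)
    f = antisymmetric(H, [-1j * C * d for d in D])               # (H, -icD)
    lam = sum(f[s][n] * F[s][n] for s in range(4) for n in range(4)) / (4 * C)
    return [
        [
            sum(F[i][m] * f[m][k] for m in range(4)) / C + (lam if i == k else 0)
            for k in range(4)
        ]
        for i in range(4)
    ]


def cross(a, b):
    """Vector product [a b]."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


# Arbitrary illustrative field values.
E = [1.0, 2.0, 3.0]
B = [4e-9, -5e-9, 6e-9]

T = energy_momentum_tensor(E, B)
w = 0.5 * (EPS0 * sum(e * e for e in E) + sum(b * b for b in B) / MU0)
S = cross(E, [b / MU0 for b in B])
trace = sum(T[i][i] for i in range(4))

print("T44 / w            =", (T[3][3] / w).real)
print("T14 / (-i S_x / c) =", (T[0][3] / (-1j * S[0] / C)).real)
print("|trace| / w        =", abs(trace) / w)
```

With these conventions the first two printed ratios should be unity
and the third should vanish up to rounding, reproducing Eqs. (6.148),
(6.149) and the statement about the spur.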

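Having established a tensor character of tensions, of a momentum,
and of an electromagnetic field energy flux and density, we
automatically obtain the rules according to which these quantities
are transformed on transition from one inertial frame of reference
to another. We shall write out only those transformation formulae
that will be needed later. Substituting the component values from
Eq. (6.151) into the general equations (A.I. 31), we get

$$T_{xx} = T_{11} = \Gamma^2\left(T'_{xx} - 2\,\frac{V}{c^2}\,S'_x - \frac{V^2}{c^2}\,w'\right), \tag{6.162}$$

$$T_{44} = \Gamma^2\left(w' + 2\,\frac{V}{c^2}\,S'_x - \frac{V^2}{c^2}\,T'_{xx}\right), \tag{6.163}$$

$$T_{xy} = T_{12} = \Gamma\left(T'_{xy} - \frac{V}{c^2}\,S'_y\right), \tag{6.164}$$

$$g_x = \Gamma^2\left\{\left(1 + \frac{V^2}{c^2}\right)g'_x + \frac{V}{c^2}\,w' - \frac{V}{c^2}\,T'_{xx}\right\}, \tag{6.165}$$

$$g_y = \Gamma\left(g'_y - \frac{V}{c^2}\,T'_{xy}\right), \tag{6.166}$$

$$g_z = \Gamma\left(g'_z - \frac{V}{c^2}\,T'_{xz}\right). \tag{6.167}$$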
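
Transformation rules of this kind can be cross-checked by applying
a Lorentz matrix to a symmetric tensor directly. The sketch below
assumes the imaginary-time convention x₄ = ict and a matrix with
α₁₁ = α₄₄ = Γ, α₁₄ = −iβΓ, α₄₁ = iβΓ, where Γ = (1 − β²)^(−1/2) and
β = V/c; this form is an assumption about the equations (A.I. 31),
which are not reproduced here. The sample numbers are arbitrary and
the helper names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz_matrix(V):
    """Assumed matrix for the transition K' -> K (K' moves along x at V)."""
    beta = V / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    a = [[0j] * 4 for _ in range(4)]
    a[0][0] = a[3][3] = gamma
    a[1][1] = a[2][2] = 1.0
    a[0][3] = -1j * beta * gamma
    a[3][0] = 1j * beta * gamma
    return a, gamma


def make_tensor(stress, S, w):
    """Symmetric tensor laid out as in Eq. (6.151), with g = S / c^2."""
    T = [
        [complex(stress[i][k]) for k in range(3)] + [-1j * S[i] / C]
        for i in range(3)
    ]
    T.append([-1j * S[k] / C for k in range(3)] + [complex(w)])
    return T


def transform(T_prime, V):
    """T_ik = a_il a_km T'_lm (summation over l and m)."""
    a, _ = lorentz_matrix(V)
    return [
        [
            sum(a[i][l] * a[k][m] * T_prime[l][m]
                for l in range(4) for m in range(4))
            for k in range(4)
        ]
        for i in range(4)
    ]


# Arbitrary illustrative values in the frame K'.
stress_p = [[1.0, 0.2, -0.3], [0.2, -0.5, 0.1], [-0.3, 0.1, 0.4]]
S_p = [0.7 * C, -0.2 * C, 0.5 * C]
w_p = 2.0
V = 0.6 * C

T = transform(make_tensor(stress_p, S_p, w_p), V)
_, G = lorentz_matrix(V)

# Right-hand sides of the transformation formulae for comparison.
T_xx = G**2 * (stress_p[0][0] - 2 * V / C**2 * S_p[0] - V**2 / C**2 * w_p)
w_new = G**2 * (w_p + 2 * V / C**2 * S_p[0] - V**2 / C**2 * stress_p[0][0])
T_xy = G * (stress_p[0][1] - V / C**2 * S_p[1])
g_x = G**2 * ((1 + V**2 / C**2) * S_p[0] / C**2
              + V / C**2 * w_p - V / C**2 * stress_p[0][0])

print("T_xx:", T[0][0].real, "vs", T_xx)
print("w   :", T[3][3].real, "vs", w_new)
print("T_xy:", T[0][1].real, "vs", T_xy)
print("g_x :", (T[0][3] / (-1j * C)).real, "vs", g_x)
```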


§ 6.12. An energy-momentum-tension tensor of an electromagnetic
field in a medium. The Minkowski tensor and Abraham
tensor. An energy-momentum-tension tensor (EMT) in a medium
attracts interest primarily because this tensor is associated with a
momentum of an electromagnetic field in a medium. The latter
quantity is directly related to the quantities observed in an
experiment, e.g. to the pressure of light. This tensor, however, is
not defined in a unique manner in a medium, and the discussion about
its “proper form” is still going on.

Let us find the general form of an energy-momentum tensor in
a uniform isotropic medium. It was shown in § 6.11 that the
general form of energy-momentum tensor components in a uniform
isotropic medium does not differ from that in the case of vacuum;
in both cases (see Eq. (6.146))

$$T_{ik} = -\frac{1}{c}F_{im}f_{km} + \delta_{ik}\Lambda. \tag{6.168}$$


However, the proportionality factors between spatial and temporal
components of f and 𝔉 are different in a medium (see Eqs.
(6.140) and (6.141)), and the tensor T_ik defined according to Eq.
(6.168) turns out to be asymmetric in contrast to the tensor (6.149).
Asymmetry arises due to temporal components of the tensor; spatial
components are symmetric, at least in an isotropic medium.
It is indeed easy to see that spatial components of the tensor T_ik
in a medium differ from those in vacuo only by the values of ε
and μ. So, for example

$$T_{11} = -\frac{1}{c}F_{1m}f_{1m} + \Lambda = -\frac{1}{c}\,(cB_zH_z + cB_yH_y - cE_xD_x) + \Lambda = -H_xB_x - H_yB_y - H_zB_z + H_xB_x + E_xD_x + \frac{\mathbf{B}\mathbf{H} - \mathbf{D}\mathbf{E}}{2} = H_xB_x + E_xD_x - \frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2}.$$


This expression coincides with the value of T_11 given by Eq.
(6.147), the only difference being the quantities ε and μ replacing
the quantities ε₀ and μ₀ of Eq. (6.147). Exactly in the same way
we shall find that

$$T_{44} = -\frac{1}{c}F_{4m}f_{4m} + \Lambda = \frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{2} = w.$$


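However, while

$$T_{14} = -ic\,\varepsilon\mu\,[\mathbf{E}\mathbf{H}]_x = -\frac{i}{c}\,\frac{\varepsilon\mu}{\varepsilon_0\mu_0}\,S_x,$$

the component with the indices interchanged is

$$T_{41} = -\frac{i}{c}\,[\mathbf{E}\mathbf{H}]_x = -\frac{i}{c}\,S_x.$$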


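Consequently, the energy-momentum-tension tensor in a uniform
isotropic medium, obtained by a direct transformation of a 4-force,
is not symmetric any more. It is called the Minkowski tensor, and
its components have the form

$$T^{M}_{ik} = \begin{pmatrix} T_{11} & T_{12} & T_{13} & -\dfrac{i}{c}\,n^2S_x \\ T_{21} & T_{22} & T_{23} & -\dfrac{i}{c}\,n^2S_y \\ T_{31} & T_{32} & T_{33} & -\dfrac{i}{c}\,n^2S_z \\ -\dfrac{i}{c}S_x & -\dfrac{i}{c}S_y & -\dfrac{i}{c}S_z & w \end{pmatrix} = \begin{pmatrix} T_{\alpha\beta} & -\dfrac{i}{c}\,n^2\mathbf{S} \\ -\dfrac{i}{c}\mathbf{S} & w \end{pmatrix}. \tag{6.169}$$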


Eq. (6.169) includes a refraction index n = √(εμ/ε₀μ₀). The
momentum density corresponding to tensor (6.169) (i.e. the
components T_14, T_24, T_34) turns out to be equal to

$$\mathbf{g}^{M} = \frac{n^2}{c^2}\,\mathbf{S} = [\mathbf{D}\mathbf{B}]. \tag{6.170}$$

The superscript “M” appearing in a momentum density symbol
points to the fact that this density corresponds to the Minkowski
tensor. The momentum density of a field in a medium (see Eq.
(6.170)) is n² times greater than that in vacuo (Eq. (6.133)).
It is often suggested to use Eq. (6.133) for the description of
the momentum density of a field in a medium as well. Surely, this
way we subdivide a total momentum into a momentum of a field
and a momentum of a medium itself. However, a momentum density
of a field can be separated in the form of Eq. (6.133) only if
another energy-momentum tensor, different from the Minkowski
tensor, is utilized. It is also important to keep Eq. (6.133) because
it yields the most general formulation of the energy inertia law.
Since an energy flux is described by the components T_4α of an
energy-momentum tensor and a momentum density by the components
T_α4, Eq. (6.133) indicates the tensor’s symmetry. Therefore,
we should construct a new symmetric tensor satisfying the following
conditions: T_α4 = −icg_α and T_4α = −(i/c) S_α; Eq. (6.133)
should be satisfied as well. The three-dimensional tension tensor
T_αβ should coincide with the three-dimensional tension tensor of
Maxwell (6.128). Such a tensor was suggested by Abraham in the
following form:

$$T^{A}_{ik} = \begin{pmatrix} T_{\alpha\beta} & -ic\mathbf{g}^{A} \\ -\dfrac{i}{c}\mathbf{S} & w \end{pmatrix}, \qquad \mathbf{g}^{A} = \frac{\mathbf{S}}{c^2}. \tag{6.171}$$

Since T_αβ = T_βα and owing to Eq. (6.133), this tensor is
symmetric. The introduction of the Abraham tensor implies an
appearance of a volume force acting on a medium. This force is
referred to as the Abraham force. To find its magnitude, recall that
the Lorentz force density components are related to the Minkowski
energy-momentum tensor by the relation

$$f_i = \frac{\partial T^{M}_{ik}}{\partial x_k}. \tag{6.172}$$
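This is precisely how the Minkowski tensor was obtained.
It follows directly from Eq. (6.172) that

$$f_\alpha\mathbf{m}_\alpha = \mathbf{f}^{L} = \frac{\partial T^{M}_{\alpha k}}{\partial x_k}\,\mathbf{m}_\alpha. \tag{6.173}$$

Writing out the sum on the right-hand side and interchanging
the terms of the equation, we get (see Eq. (6.169))

$$\frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha - \frac{\partial\mathbf{g}^{M}}{\partial t} = \mathbf{f}^{L}, \tag{6.174}$$

where f^L denotes the Lorentz force density.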
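Now let us write the expression (∂T^A_αk/∂x_k) m_α in full, taking
into account Eq. (6.171):

$$\frac{\partial T^{A}_{\alpha k}}{\partial x_k}\,\mathbf{m}_\alpha = \frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha - \frac{\partial\mathbf{g}^{A}}{\partial t} = \frac{\partial T_{\alpha\beta}}{\partial x_\beta}\,\mathbf{m}_\alpha - \frac{\partial\mathbf{g}^{M}}{\partial t} + \frac{\partial}{\partial t}\,(\mathbf{g}^{M} - \mathbf{g}^{A}). \tag{6.175}$$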
The second link of Eq. (6.175) takes into account that the tension
tensor of Maxwell is the same both in the Minkowski and in the
Abraham tensor; the third link of Eq. (6.175) is an identical
transformation of the second link. But the first two terms of the
last link of Eq. (6.175) can now be substituted according to
Eq. (6.174) to yield

$$\frac{\partial T^{A}_{\alpha k}}{\partial x_k}\,\mathbf{m}_\alpha = \mathbf{f}^{L} + \mathbf{f}^{A}. \tag{6.176}$$


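The right-hand side of Eq. (6.176) contains a term f^A representing
the derivative of a momentum density; according to the second law
of Newton the derivative of a momentum density with respect to
time makes up a force density

$$\mathbf{f}^{A} = \frac{\partial}{\partial t}\,(\mathbf{g}^{M} - \mathbf{g}^{A}) = \frac{\partial}{\partial t}\left\{[\mathbf{D}\mathbf{B}] - \frac{1}{c^2}[\mathbf{E}\mathbf{H}]\right\}. \tag{6.177}$$

In an isotropic medium

$$\mathbf{f}^{A} = (\varepsilon\mu - \varepsilon_0\mu_0)\,\frac{\partial\mathbf{S}}{\partial t} = \frac{n^2 - 1}{c^2}\,\frac{\partial\mathbf{S}}{\partial t}. \tag{6.178}$$

A force density given by Eqs. (6.177) and (6.178) is referred to
as the Abraham force density.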

The Minkowski tensor and the Abraham tensor furnish different
expressions for an electromagnetic field momentum density. Let us
write out the corresponding expressions for a plane electromagnetic
wave momentum density. In the case of a plane electromagnetic
wave propagating in a uniform isotropic dielectric the relationship
between the value of the Poynting vector S, the monochromatic
wave phase velocity v and the wave energy density w
is given by the simple equation:

$$S = w\,v, \tag{6.179}$$

$$v = c/n = c\,(\varepsilon\mu/\varepsilon_0\mu_0)^{-1/2}. \tag{6.180}$$

Eqs. (6.170) and (6.171) furnish the momentum densities:

$$g^{M} = \frac{n^2}{c^2}\,S = \frac{n^2}{c^2}\,w\,\frac{c}{n} = \frac{w}{c}\,n, \tag{6.181}$$

$$g^{A} = \frac{S}{c^2} = \frac{1}{c^2}\,w\,\frac{c}{n} = \frac{w}{cn}. \tag{6.182}$$
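
The difference between the two momentum densities can be made
concrete with a short computation. This is a sketch only: the
function name momentum_densities is hypothetical, and the energy
density and refraction indices below are illustrative values, not
data from the text.

```python
C = 299_792_458.0  # speed of light, m/s


def momentum_densities(w, n):
    """Minkowski and Abraham momentum densities of a plane wave.

    w is the energy density and n the refraction index. Uses
    S = w * c / n from Eqs. (6.179) and (6.180), then Eqs. (6.181)
    and (6.182).
    """
    S = w * C / n
    g_minkowski = n**2 * S / C**2
    g_abraham = S / C**2
    return g_minkowski, g_abraham


# Illustrative energy density (J/m^3) and refraction indices.
w = 1.0e-6
for n in (1.0, 1.33, 1.5):
    g_m, g_a = momentum_densities(w, n)
    print(f"n = {n}: g_M = {g_m:.4e}, g_A = {g_a:.4e}, g_M / g_A = {g_m / g_a:.4f}")
```

The ratio g^M/g^A equals n² for every w, and both densities reduce
to w/c when n = 1, i.e. in vacuo.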


Let us assume that electromagnetic field energy can be quantized,
i.e. w = Nħω, where N is the number of quanta in a unit of
volume. Then from Eqs. (6.181) and (6.182) we obtain the following
quantum momenta in a medium:

$$p^{M} = \frac{\hbar\omega}{c}\,n, \tag{6.183}$$

$$p^{A} = \frac{\hbar\omega}{cn}. \tag{6.184}$$
Which of these two equations is “true”? The technique of second
quantization of an electromagnetic field in matter results in Eq.
(6.183). Let us assume that a quantum momentum in a medium p
is defined as follows:

$$\mathbf{p} = \frac{\hbar\omega}{c}\,n\,\mathbf{s}, \tag{6.185}$$

where s is a unit vector in the direction of wave propagation,
and a 4-vector of a quantum energy-momentum takes the form

$$p\left(\frac{\hbar\omega}{c}\,n\,\mathbf{s},\ \frac{i\hbar\omega}{c}\right). \tag{6.186}$$

Using Eq. (6.186), one can obtain a correct expression for the
condition of the Vavilov-Cherenkov radiation (see Chapter 7). This
circumstance seems to indicate that Eq. (6.186) and the Minkowski
tensor are to be preferred. However, a meticulous usage of any of
these tensors provides a correct result. The point is that both
tensors satisfy Eq. (6.127) which is the consequence of the field
equations. It is important to know how an electromagnetic field
momentum is defined in matter. In a general case, it is impossible
to split the total momentum of a field into a fraction pertaining
only to matter and that pertaining only to a field. But this is
precisely what is done when the Abraham momentum is introduced.
When a light wave passes from vacuum into a medium, its momentum
is not totally transferred into this medium; a fraction of the
momentum is transmitted to the medium itself. The Minkowski tensor
is utilized whenever a total transmitted momentum is considered; if
we deal with a momentum related to radiation in a medium, the
Abraham tensor should be used. Eq. (6.185) provides a correct value
for a photon momentum in the case of the Vavilov-Cherenkov
radiation (see Chapter 7) because an overall momentum which a
Cherenkov electron transmits to a medium is of prime importance
here. The overall momentum transmitted to a photon in a medium
is just equal to ħωn/c. No wonder that quantization of an
electromagnetic field in dielectrics results in Eq. (6.185)
describing a momentum. This expression represents an overall
momentum of an electromagnetic field, this momentum being associated
both with a field and with matter (see § 7.7).

As to a force acting on a matter, it is related to the Abraham 
force and, naturally, to the Abraham tensor. An attempt to measure 
ee force was undertaken in 1975 which apparently suc- 
ceeded. 

Fig. 6.7 illustrates the experimental arrangement. A disc made of barium titanate ($\varepsilon \approx 4000$, $\mu = \mu_0$) has a small hole in the centre. Both rims of the disc are coated with aluminium; consequently, it forms a cylindrical capacitor. The disc is suspended on a long tungsten filament so that it can perform torsional oscillations between the poles of a dc electromagnet generating a 10 kGs field. An ac voltage with a 150 V peak value is applied to the internal rim electrode, while the outer electrode is grounded by means of a thin gold wire not affecting the disc's oscillations. The voltage is applied in phase with the characteristic oscillations of the disc.

Fig. 6.7. An experimental observation of the Abraham force. (Labels in the figure: suspension and ac voltage; tungsten; grounding; 35 cm.)

The Abraham force can also be written in the form

$$ \mathbf{f}^{A} = \varepsilon_0\mu_0(\kappa_m\kappa_e - 1)\,\frac{\partial}{\partial t}[\mathbf{E}\mathbf{H}] = \varepsilon_0\mu_0(\kappa_m\kappa_e - 1)\,[\dot{\mathbf{E}}\mathbf{H}], \tag{6.187} $$

where

$$ \kappa_e = \varepsilon/\varepsilon_0, \quad \kappa_m = \mu/\mu_0, \tag{6.188} $$

and the fact that the magnetic field is constant ($\dot{\mathbf{H}} = 0$) is taken into account. In the case of barium titanate $\kappa_m = 1$ and consequently

$$ \mathbf{f}^{A} = [\varepsilon_0(\kappa_e - 1)\dot{\mathbf{E}},\ \mu_0\mathbf{H}] = [\dot{\mathbf{P}},\ \mu_0\mathbf{H}]; \tag{6.189} $$

the last equation follows from the fact that in a uniform isotropic medium

$$ \mathbf{P} = \mathbf{D} - \varepsilon_0\mathbf{E} = (\varepsilon - \varepsilon_0)\mathbf{E} = \varepsilon_0(\kappa_e - 1)\mathbf{E}. \tag{6.190} $$


The physical meaning of the “Abraham force” is obvious in the specific case just considered. $\dot{\mathbf{P}}$ is the fraction of the displacement current caused by the motion of bound charges, so in essence this is the Ampère force. It is not difficult to see that this force induces a torque, the electric field being directed radially. Surely, the essential point is that no other forces associated with the presence of an electromagnetic field contribute to the torque.

The authors who proposed this experiment claim that the ob- 
served oscillations of the disc are consistent with the calculation 
based on the presence of the Abraham force. We should point out 
here again that this experimental result, however interesting per 
se, by no means “chooses” between the tensors (6.169) and (6.171). 
Some additional remarks concerning a choice of an expression for 
a photon momentum in a medium can be found in § 7.7. 

§ 6.13. An energy-momentum-tension tensor of a spherically symmetric charge. If an electric charge in vacuo is at rest in the frame K, there is only an electric field in the frame and the energy-momentum-tension tensor can readily be written in the form

$$ T_{ik} = \begin{pmatrix} T_{\alpha\beta} & 0 \\ 0 & w \end{pmatrix}, \tag{6.191} $$

where $w = \varepsilon_0 E^2/2$ and $T_{\alpha\beta} = \varepsilon_0 E_\alpha E_\beta - \delta_{\alpha\beta}w$. If the charge moves relative to the frame K' at the velocity $-\mathbf{V}$, its tensor $T'_{ik}$ can be found through the use of the general equations for the tensor component transformation. Specifically, in order to transform the momentum density along the x axis, that is $(i/c)\,T'_{14}$, and the energy density $T'_{44}$, we have the following equations (see 31):

$$ T'_{14} = -i\beta\Gamma^2 T_{11} + i\beta\Gamma^2 T_{44} = i\beta\Gamma^2\,(w - T_{11}), $$
$$ T'_{44} = \Gamma^2 T_{44} - \beta^2\Gamma^2 T_{11} = \Gamma^2\,(w - \beta^2 T_{11}). $$
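Let us find the total energy and total momentum of a point charge, taking into account that a transition from a volume element $d\tau'$ in the frame K' to a volume element $d\tau$ in the frame K is effected by means of the formula $d\tau' = d\tau/\Gamma$:

$$ U' = \int T'_{44}\,d\tau' = \frac{1}{\Gamma}\int T'_{44}\,d\tau = \Gamma\int (w - \beta^2 T_{11})\,d\tau, \tag{6.192} $$
$$ G'_1 = \frac{i}{c}\int T'_{14}\,d\tau' = \frac{i}{c\Gamma}\int T'_{14}\,d\tau = \frac{\beta\Gamma}{c}\int (T_{11} - w)\,d\tau. \tag{6.193} $$

It is obvious that

$$ \int w\,d\tau = \frac{\varepsilon_0}{2}\int E^2\,d\tau = U. $$

If in the charge's proper frame of reference K the field possesses a spherical symmetry, so that, on integration over the volume, $E_x^2 = E_y^2 = E_z^2 = E^2/3$, then

$$ \int T_{11}\,d\tau = \varepsilon_0\int\left(\frac{E^2}{3} - \frac{E^2}{2}\right)d\tau = -\frac{\varepsilon_0}{6}\int E^2\,d\tau = -\frac{U}{3}. $$

Consequently, Eqs. (6.192) and (6.193) take the form

$$ U' = \Gamma U\left(1 + \frac{\beta^2}{3}\right), \tag{6.194} $$
$$ G'_1 = -\frac{4}{3}\,\frac{\beta}{c}\,\Gamma U. \tag{6.195} $$

The momentum components $G'_2$ and $G'_3$ turn into zero, so that

$$ \mathbf{G}' = -\frac{4}{3}\,\frac{\mathbf{V}}{c^2}\,\Gamma U. \tag{6.196} $$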


The appearance of the “minus” sign in the last equation is due to the fact that the charge moves at the velocity $-\mathbf{V}$ relative to the frame K'. Comparing Eqs. (6.194) and (6.195) with the equations transforming the momentum and energy of a particle on transition from the proper frame of reference to an arbitrary one (see Eq. (5.49)), we see that these equations are different. In former times an effort was made to treat the electron mass as an electromagnetic one using the relation

$$ m = U/c^2. \tag{6.197} $$

Eq. (6.196) shows that such an interpretation gets into trouble: to “confine” a charge, some additional forces neutralizing the repulsion are required, i.e. there is additional energy that was not accounted for. Having taken into account mechanical stresses, we can obtain the following relations:

$$ \mathbf{G}' = -\Gamma\,\frac{U}{c^2}\,\mathbf{V}, \quad U' = \Gamma U, \tag{6.198} $$

in complete agreement with Eq. (5.49). The details can be found in [13].
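
The discrepancy between Eqs. (6.194), (6.195) and the particle-like relations (6.198) can be seen with a few lines of arithmetic. The following is only an illustrative sketch, not part of the original derivation: the function names, the choice of units with c = 1 and the sample values U = 1, β = 0.6 are assumptions made for the example.

```python
import math

def gamma(beta):
    """Lorentz factor for the dimensionless speed beta = V/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def field_energy_momentum(u_rest, beta, c=1.0):
    """Energy U' and momentum G'_1 of the pure field, Eqs. (6.194)-(6.195)."""
    g = gamma(beta)
    energy = g * u_rest * (1.0 + beta * beta / 3.0)
    momentum = -(4.0 / 3.0) * (beta / c) * g * u_rest
    return energy, momentum

def particle_energy_momentum(u_rest, beta, c=1.0):
    """Energy and momentum of a particle of mass U/c^2, as in Eq. (6.198)."""
    g = gamma(beta)
    return g * u_rest, -g * beta * u_rest / c

# Hypothetical sample values: U = 1 (arbitrary units), beta = 0.6, c = 1.
u_field, g_field = field_energy_momentum(1.0, 0.6)
u_part, g_part = particle_energy_momentum(1.0, 0.6)
momentum_ratio = g_field / g_part   # the factor 4/3
energy_ratio = u_field / u_part     # the factor 1 + beta^2/3
```

The momentum ratio does not depend on β, whereas the energy ratio does; neither equals unity, which is the mismatch with Eq. (5.49) noted in the text.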

§ 6.14. The field potentials in a moving non-conducting medium.* In § 6.1 we introduced a 4-potential of an electromagnetic field in vacuo. Surely, an electromagnetic field can be determined immediately from the Maxwell equations, without resorting to potentials. In many cases, however, the use of potentials as intermediate quantities determining the fields E and B proves to be very convenient, if only because the number of functions to be determined decreases. Knowing only the four components of the potential, one can find from them all components of the electric and magnetic fields. The potentials are still more convenient to use in the electrodynamics of moving media, where the material equations (6.74) and (6.75) turn out to be much more complicated than in the case of a stationary medium.

It will be shown below how one can obtain the expressions for field potentials in a moving medium. To illustrate an application of such potentials, we shall consider the propagation of a plane electromagnetic wave in a medium moving relative to a stationary observer. This example has a direct bearing on problems analysed in Chapter 7. In this section we apply the methods of tensor algebra described briefly in Appendix I, § 3.

Now let us derive equations for a 4-potential in moving media. We shall describe a field in a moving medium by two tensors: $F_{ik}$ (see Eq. (6.29)) and $f_{ik}$ (see Eq. (6.31)). The tensor $F_{ik}$ is sometimes referred to as a field tensor, and the tensor $f_{ik}$ as an induction tensor.

* §§ 6.14 and 6.15 are written by B. M. Bolotovsky and S. N. Stolyarov.

Let us introduce a four-dimensional field potential in a medium, $\Phi_i$, having defined it by the following relation

$$ F_{ik} = c\left(\frac{\partial \Phi_k}{\partial x_i} - \frac{\partial \Phi_i}{\partial x_k}\right), \tag{6.199} $$

coinciding with Eq. (6.28). Knowing the four components of the potential $\Phi_i$, we can use this equation to determine all components of the tensor $F_{ik}$, that is the magnetic induction B and the electric field E. In order to describe an electromagnetic field in a medium completely, one also needs to know the components of the tensor $f_{ik}$, i.e. the components of the vectors of the magnetic field and electric induction. If the field tensor F is known, the induction tensor f can be determined by means of the material equations (6.79) and (6.80) relating the components of these two tensors. Recall that the relation between the tensors F and f in vector form is given by the Minkowski equations (6.74) and (6.75).

The material equations (6.79) and (6.80) defining the relationship between the tensors F and f can be written in the form of a single tensor relation

$$ f_{ik} = \varepsilon_{iklm}F_{lm}, \tag{6.200} $$

where the tensor of the fourth rank $\varepsilon_{iklm}$ is so chosen as to make the Minkowski equations (6.74) and (6.75) valid. It is easy to show that the tensor of the form

$$ \varepsilon_{iklm} = \frac{1}{\mu c}\,(\delta_{il} - \kappa c^{-2}U_iU_l)(\delta_{km} - \kappa c^{-2}U_kU_m) \tag{6.201} $$

possesses the necessary properties. Here $\delta_{ik}$ is the Kronecker delta defined by Eq. (A.I.4), and $U_i$ are the four-dimensional velocity components. This four-dimensional velocity was discussed in Chapter 5 and shown to have the components $U_i(\Gamma\mathbf{V}, ic\Gamma)$, where $\mathbf{V}$ is the three-dimensional velocity of the medium. The dimensionless constant $\kappa$ is defined via the refraction index n:

$$ \kappa = n^2 - 1 = \frac{\varepsilon\mu}{\varepsilon_0\mu_0} - 1, \quad n^2 = \frac{\varepsilon\mu}{\varepsilon_0\mu_0}. \tag{6.202} $$

It is easy to see that $\kappa = 0$ in vacuo, and the relation (6.200) between the tensors f and F takes the form

$$ f_{ik} = \frac{1}{\mu_0 c}\,F_{ik}, \tag{6.203} $$

corresponding to the well-known relations between the fields E and H, and the inductions D and B in vacuo:

$$ \mathbf{D} = \varepsilon_0\mathbf{E}, \quad \mathbf{B} = \mu_0\mathbf{H}. \tag{6.204} $$

In a stationary medium the tensor $\varepsilon_{iklm}$ of Eq. (6.201) yields the following relations between the fields and inductions:

$$ \mathbf{D} = \varepsilon\mathbf{E}, \quad \mathbf{B} = \mu\mathbf{H}. \tag{6.205} $$

This is easy to ascertain by setting $U_1 = U_2 = U_3 = 0$ and $U_4 = ic$ in Eq. (6.201).

Since the components of the tensor $F_{lm}$ are expressed via the components of the four-dimensional potential $\Phi_i$, and the components of the induction tensor $f_{ik}$ are related to $F_{lm}$ by Eq. (6.200), the components of the tensor $f_{ik}$ can also be expressed through the components of the four-dimensional potential $\Phi_i$. In other words, to determine all components of fields and inductions in moving media, it is sufficient to know the four functions $\Phi_i$.

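Now let us derive the equations for field potentials in moving media. For this purpose let us make use of Eq. (6.60):

$$ \frac{\partial f_{ik}}{\partial x_k} = s_i. $$

Substituting $f_{ik}$ from Eq. (6.200), we get

$$ \varepsilon_{iklm}\,\frac{\partial F_{lm}}{\partial x_k} = s_i. \tag{6.206} $$

Utilizing Eq. (6.201), which gives an explicit expression for the tensor $\varepsilon_{iklm}$, as well as Eq. (6.28) expressing $F_{lm}$ through the potential components $\Phi_i$, we reduce Eq. (6.206) to the following form by simple transformations:

$$ \frac{1}{\mu}\,(\delta_{il} - \kappa c^{-2}U_iU_l)\left\{\frac{\partial}{\partial x_l}\left(\frac{\partial\Phi_m}{\partial x_m} - \kappa c^{-2}U_kU_m\frac{\partial\Phi_m}{\partial x_k}\right) - \left[\frac{\partial^2}{\partial x_k^2} - \kappa c^{-2}\left(U_k\frac{\partial}{\partial x_k}\right)^2\right]\Phi_l\right\} = s_i. \tag{6.207} $$

Now multiplying both sides of Eq. (6.207) by the tensor

$$ \left(\delta_{ni} + \frac{\kappa}{1+\kappa}\,c^{-2}U_nU_i\right) $$

and making use of the relation

$$ (\delta_{il} - \kappa c^{-2}U_iU_l)\left(\delta_{lk} + \frac{\kappa}{1+\kappa}\,c^{-2}U_lU_k\right) = \delta_{ik}, \tag{6.208} $$

which is easy to check, we finally obtain

$$ \left[\frac{\partial^2}{\partial x_k^2} - \kappa c^{-2}\left(U_k\frac{\partial}{\partial x_k}\right)^2\right]\Phi_i - \frac{\partial}{\partial x_i}\left(\frac{\partial\Phi_m}{\partial x_m} - \kappa c^{-2}U_kU_m\frac{\partial\Phi_m}{\partial x_k}\right) = -\mu\left(\delta_{ik} + \frac{\kappa}{1+\kappa}\,c^{-2}U_iU_k\right)s_k. \tag{6.209} $$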
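
The relation (6.208) used in this derivation rests on the normalization $U_lU_l = -c^2$ of the four-velocity, and it can be checked numerically. The sketch below is an illustration only: it assumes motion along the x axis, units with c = 1, and hypothetical sample values (n = 1.5, V = 0.3c); the function names are not from the text.

```python
import math

def four_velocity(v, c=1.0):
    """Components U_i = (Gamma*V, 0, 0, i*c*Gamma) for motion along the x axis."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [g * v, 0.0, 0.0, 1j * c * g]

def delta(i, k):
    """Kronecker delta."""
    return 1.0 if i == k else 0.0

def identity_residual(kappa, v, c=1.0):
    """Largest deviation of the left-hand side of Eq. (6.208) from delta_ik."""
    u = four_velocity(v, c)
    worst = 0.0
    for i in range(4):
        for k in range(4):
            lhs = sum(
                (delta(i, l) - kappa * u[i] * u[l] / c**2)
                * (delta(l, k) + kappa / (1.0 + kappa) * u[l] * u[k] / c**2)
                for l in range(4)
            )
            worst = max(worst, abs(lhs - delta(i, k)))
    return worst

# Hypothetical sample values: n = 1.5 (kappa = n^2 - 1 = 1.25), V = 0.3c.
residual = identity_residual(kappa=1.25, v=0.3)
```

The residual vanishes to rounding accuracy for any κ > -1 and any V < c, since the cross terms cancel once $U_lU_l = -c^2$ is inserted.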
The system of equations (6.209) defines all components of the potential $\Phi_i$ from the given field sources $s_i$ in a moving medium. This system can be simplified provided a well-chosen additional condition is imposed on the potentials, e.g. the following relation is required to be valid:

$$ \frac{\partial\Phi_m}{\partial x_m} - \kappa c^{-2}U_kU_m\frac{\partial\Phi_m}{\partial x_k} = 0. \tag{6.210} $$

This condition is a generalization of the well-known Lorentz condition imposed on potentials in vacuo (see Eq. (6.8)). The feasibility of Eq. (6.210) is demonstrated in the same way as in conventional electrodynamics.

When condition (6.210) is satisfied, Eq. (6.209) is simplified and takes the form

$$ \left[\frac{\partial^2}{\partial x_k^2} - \kappa c^{-2}\left(U_k\frac{\partial}{\partial x_k}\right)^2\right]\Phi_i = -\mu\left(\delta_{ik} + \frac{\kappa}{1+\kappa}\,c^{-2}U_iU_k\right)s_k. \tag{6.211} $$

The system (6.211) is more convenient than the system (6.209) since it comprises four equations, each of which includes only one component of the potential (i = 1, 2, 3, 4). For given external sources, the solution of the system (6.211) fully defines the field generated by these sources in a moving medium.

If a moving medium has an interface, the system (6.211) should be supplemented by the requisite boundary conditions (see § 6.8).

As an example of solving the equations obtained, let us consider an electromagnetic field in a moving medium in the absence of external sources, both currents and charges. Since in this case all $s_i = 0$, the system (6.211) turns into a system of four homogeneous equations:

$$ \left[\frac{\partial^2}{\partial x_k^2} - \kappa c^{-2}\left(U_k\frac{\partial}{\partial x_k}\right)^2\right]\Phi_i = 0. \tag{6.212} $$
Due to the additional condition (6.210) only three out of the four quantities $\Phi_i$ are independent. Accordingly, we can assume $\Phi_4 = 0$ and treat the remaining three quantities $\Phi_1$, $\Phi_2$, $\Phi_3$ as components of a three-dimensional vector which we shall denote by $\mathbf{A}$. Thus we see that with such a gauge the potential $\Phi_i$ for a moving medium reduces to a three-dimensional vector potential $\mathbf{A}$.

In this case the system of equations (6.212) can be rewritten in terms of the vector $\mathbf{A}$:

$$ \left\{\left(\Delta - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) - \frac{n^2-1}{c^2(1-\beta^2)}\left[\frac{\partial}{\partial t} + (\mathbf{V}\nabla)\right]^2\right\}\mathbf{A} = 0, \tag{6.213} $$

where $\beta = V/c$, under the additional condition

$$ \operatorname{div}\mathbf{A} - \frac{n^2-1}{c^2(1-\beta^2)}\left[(\mathbf{V}\nabla) + \frac{\partial}{\partial t}\right](\mathbf{V}\mathbf{A}) = 0, \tag{6.214} $$

which follows from the additional condition (6.210) at $\Phi_4 = 0$. If we know how to solve Eq. (6.213) with respect to the potential $\mathbf{A}$, the fields E and B can be expressed through $\mathbf{A}$ according to Eq. (6.28), which in our case takes a simple form

$$ \mathbf{B} = \operatorname{rot}\mathbf{A}, \quad \mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t}. \tag{6.215} $$

Knowing E and B, we can find D and H by means of the Minkowski equations (6.74) and (6.75) for a moving medium.

Eq. (6.213) describes the propagation of free electromagnetic waves in a moving medium. By free electromagnetic waves one usually means a field in the absence of charges and currents. Now let us pass over to the solution of this equation. We shall seek the solution for the vector potential $\mathbf{A}$ in the form of a plane electromagnetic wave:

$$ \mathbf{A} = \mathbf{A}_0 e^{i(\omega t - \mathbf{k}\mathbf{r})}. \tag{6.216} $$

Substituting this expression into Eq. (6.213), we obtain

$$ \left\{\left(-k^2 + \frac{\omega^2}{c^2}\right) + \frac{n^2-1}{c^2(1-\beta^2)}\,(\mathbf{k}\mathbf{V} - \omega)^2\right\}\mathbf{A}_0 = 0. \tag{6.217} $$

Eq. (6.217) shows that the plane wave amplitude $\mathbf{A}_0$ differs from zero only for those waves that satisfy the condition

$$ \left(-k^2 + \frac{\omega^2}{c^2}\right) + \frac{n^2-1}{c^2(1-\beta^2)}\,(\mathbf{k}\mathbf{V} - \omega)^2 = 0. \tag{6.218} $$

Eq. (6.218) can be readily derived from the dispersion equation which is valid for plane monochromatic waves in a stationary medium:

$$ k^2 - \omega^2/u^2 = 0, \quad k^2 = \mathbf{k}^2, $$

where $u = c/n$ is the phase velocity of light in the stationary medium. We shall rewrite it in the form

$$ \left(k^2 - \frac{\omega^2}{c^2}\right) - \frac{n^2-1}{c^2}\,\omega^2 = 0. $$

Here the parentheses enclose the square of the four-dimensional wave vector in vacuo $k_i(\mathbf{k}, i\omega/c)$, a quantity which is invariant under the Lorentz transformation: it retains its form and numerical value in all inertial frames of reference. The second term of the last equation transforms together with the frequency $\omega$. In the reference frame in which the medium moves at the velocity $\mathbf{V}$, $\omega$ should be replaced by

$$ \omega' = \frac{\omega - \mathbf{k}\mathbf{V}}{\sqrt{1 - V^2/c^2}} $$

(see § 7.2).

Due to these considerations the dispersion equation takes the form (6.218),

$$ \left(k^2 - \frac{\omega^2}{c^2}\right) - \frac{n^2-1}{c^2(1-\beta^2)}\,(\omega - \mathbf{k}\mathbf{V})^2 = 0, $$

in the reference frame relative to which the medium moves at the velocity $\mathbf{V}$. This condition defines a relationship between the wave vector $\mathbf{k}$ and the frequency $\omega$ of a plane electromagnetic wave propagating in a moving medium. The additional condition (6.214) for such a wave takes the form

$$ \left(\mathbf{A}_0,\ \mathbf{k} + \mathbf{V}\,\frac{n^2-1}{c^2(1-\beta^2)}\,(\omega - \mathbf{k}\mathbf{V})\right) = 0. \tag{6.219} $$

From Eq. (6.219), which requires the scalar product to vanish, it follows that in a moving medium the vector $\mathbf{A}_0$ is perpendicular not to the wave propagation direction defined by the wave vector $\mathbf{k}$, but to a linear combination of the wave vector $\mathbf{k}$ and the velocity vector $\mathbf{V}$ of the medium. In two specific cases, when a wave propagates in vacuo (n = 1) and when the medium is stationary (V = 0), Eq. (6.219) turns into the well-known relation for free transverse electromagnetic waves, $(\mathbf{A}_0\mathbf{k}) = 0$, from which it follows that in a free electromagnetic wave the vectors E, H, B and D are perpendicular to the wave vector, that is to the wave propagation direction. In a moving medium, however, the waves are, generally speaking, not transverse. Indeed, in the case of a plane wave (Eq. (6.216)) the fields E and B are determined according to Eq. (6.215):

$$ \mathbf{B} = -i[\mathbf{k}\mathbf{A}_0]\,e^{i(\omega t - \mathbf{k}\mathbf{r})}, \quad \mathbf{E} = -i\omega\mathbf{A}_0\,e^{i(\omega t - \mathbf{k}\mathbf{r})}. $$

Whence it is seen that the vector B is perpendicular to the wave vector $\mathbf{k}$ while the vector E is not (since the vector $\mathbf{A}_0$ is not transverse according to the condition (6.219)).

Eq. (6.217) relating the wave vector $\mathbf{k}$ and the frequency $\omega$ of a wave in a moving medium includes the scalar product $(\mathbf{k}\mathbf{V})$. This means that the wave propagation conditions depend on the angle between the propagation direction, i.e. the wave vector $\mathbf{k}$, and the velocity of the medium $\mathbf{V}$. This circumstance points to the phenomenon of the dragging of light by a moving medium. Let us consider this phenomenon in detail in the case of small velocities. Since the quantity $\beta = V/c$ is small, we shall ignore all powers of $\beta$ higher than the first in Eq. (6.218). We get

$$ \frac{\omega^2}{c^2} - k^2 + \frac{n^2-1}{c^2}\left[\omega^2 - 2\omega(\mathbf{k}\mathbf{V})\right] = 0, $$

or

$$ \omega^2 - 2(\mathbf{k}\mathbf{V})\left(1 - \frac{1}{n^2}\right)\omega - \frac{k^2c^2}{n^2} = 0. $$

Solving the obtained quadratic equation with respect to $\omega$ in the same approximation, we get

$$ \omega = \pm\frac{kc}{n} + (\mathbf{k}\mathbf{V})\left(1 - \frac{1}{n^2}\right). \tag{6.220} $$

Of the two signs in front of the first term on the right-hand side one must choose the plus sign, since in the case of V = 0 we have to get the well-known relationship between $\omega$ and k in a stationary medium,

$$ \omega = \frac{kc}{n}, \tag{6.221} $$

where n is the refraction index of the stationary medium. The quantity c/n is the phase velocity of light in a stationary medium.

With the angle between the vectors $\mathbf{k}$ and $\mathbf{V}$ denoted by $\theta$, Eq. (6.220) takes the following form

$$ \frac{\omega}{k} = \frac{c}{n} + V\cos\theta\left(1 - \frac{1}{n^2}\right) \tag{6.222} $$

for the indicated choice of sign.

Just as in the case of a stationary medium (see Eq. (6.221)), the quantity $\omega/k$ in Eq. (6.222) defines the phase velocity of light, but this time in a moving isotropic medium. Comparing Eqs. (6.222) and (6.221), we see that the phase velocity of light in a moving medium is different in different directions. If light propagates along the motion of the medium ($\cos\theta = 1$), the phase velocity is equal to

$$ \frac{\omega}{k} = \frac{c}{n} + V\left(1 - \frac{1}{n^2}\right). $$

If light propagates against the motion of the medium ($\cos\theta = -1$), then

$$ \frac{\omega}{k} = \frac{c}{n} - V\left(1 - \frac{1}{n^2}\right). $$

The factor $(1 - 1/n^2)$ is the so-called light drag coefficient, which was measured experimentally by Fizeau with water serving as the moving medium.
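
The first-order result (6.222) can be compared with the exact forward root of the dispersion equation (6.218) for a wave travelling along the motion of the medium. The sketch below is an illustration under stated assumptions: the wave vector is taken parallel to V, and the function names and sample values (n = 1.5 with V = 0.1c, and n = 1.333 with V = 10 m/s for a water-like medium) are hypothetical choices, not data from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuo, m/s

def drag_phase_velocity(n, v, cos_theta=1.0):
    """First-order phase velocity omega/k of Eq. (6.222)."""
    return C / n + v * cos_theta * (1.0 - 1.0 / n**2)

def exact_phase_velocity(n, v):
    """Forward root u = omega/k of Eq. (6.218) for k parallel to V.

    Dividing (6.218) by k^2 and multiplying by c^2 gives
    (1 + q) u^2 - 2 q V u + (q V^2 - c^2) = 0,  q = (n^2 - 1)/(1 - beta^2).
    """
    q = (n**2 - 1.0) / (1.0 - (v / C) ** 2)
    a = 1.0 + q
    b = -2.0 * q * v
    c0 = q * v**2 - C**2
    return (-b + math.sqrt(b * b - 4.0 * a * c0)) / (2.0 * a)

# Hypothetical fast medium: the first-order formula is off by about beta^2.
fast_first = drag_phase_velocity(1.5, 0.1 * C)
fast_exact = exact_phase_velocity(1.5, 0.1 * C)

# Hypothetical water-like medium at laboratory speed: the two agree closely.
slow_first = drag_phase_velocity(1.333, 10.0)
slow_exact = exact_phase_velocity(1.333, 10.0)
```

For collinear propagation the exact root coincides with the relativistic addition of the velocities c/n and V, which is why the drag coefficient appears as its first-order expansion in V/c.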

§ 6.15. The field potentials in a moving conducting medium. Prior to dealing with field equations in a moving conducting medium we shall recall the main facts concerning the propagation of waves through a stationary medium possessing conductivity.

In this case the Maxwell equations take the form

$$ \begin{aligned} \operatorname{rot}\mathbf{E} &= -\dot{\mathbf{B}}, & \operatorname{rot}\mathbf{H} &= \dot{\mathbf{D}} + \mathbf{j} + \sigma\mathbf{E}, \\ \operatorname{div}\mathbf{D} &= \rho, & \operatorname{div}\mathbf{B} &= 0. \end{aligned} \tag{6.223} $$

Here $\rho$ and $\mathbf{j}$ are the charge density and current density induced by “extraneous” sources. Subsequently we shall confine ourselves to equations (6.223) in the absence of “extraneous” sources, i.e. assume $\rho = 0$ and $\mathbf{j} = 0$. The solution of equations (6.223) will be sought in the form

$$ \begin{aligned} \mathbf{E} &= \mathbf{E}_0e^{i(\mathbf{k}\mathbf{r} - \omega t)}, & \mathbf{D} &= \mathbf{D}_0e^{i(\mathbf{k}\mathbf{r} - \omega t)}, \\ \mathbf{H} &= \mathbf{H}_0e^{i(\mathbf{k}\mathbf{r} - \omega t)}, & \mathbf{B} &= \mathbf{B}_0e^{i(\mathbf{k}\mathbf{r} - \omega t)}, \end{aligned} \tag{6.224} $$

where $\mathbf{E}_0$, $\mathbf{D}_0$, $\mathbf{H}_0$, $\mathbf{B}_0$ are constant amplitudes that do not depend on either coordinates or time. Thus we seek the solution as a plane wave with the wave vector $\mathbf{k}$ and frequency $\omega$.

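Substituting the expressions for fields (6.224) into equations (6.223) and taking into account that $\rho = 0$ and $\mathbf{j} = 0$, we obtain the following algebraic, not differential, equations interrelating the field amplitudes:

$$ \begin{aligned} [\mathbf{k}, \mathbf{E}_0] &= \omega\mathbf{B}_0, & [\mathbf{k}, \mathbf{H}_0] &= -\omega\mathbf{D}_0 - i\sigma\mathbf{E}_0, \\ (\mathbf{k}\mathbf{D}_0) &= 0, & (\mathbf{k}\mathbf{B}_0) &= 0. \end{aligned} \tag{6.225} $$

We have made use of the following equations:

$$ \operatorname{rot}(\mathbf{E}_0e^{i\mathbf{k}\mathbf{r}}) = i[\mathbf{k}, \mathbf{E}_0]\,e^{i\mathbf{k}\mathbf{r}}, \quad \operatorname{div}(\mathbf{E}_0e^{i\mathbf{k}\mathbf{r}}) = i(\mathbf{k}\mathbf{E}_0)\,e^{i\mathbf{k}\mathbf{r}}. $$

Eqs. (6.225) should be supplemented with material equations defining the relationship between field and induction values. In the simplest case of an isotropic stationary medium we assume the following relations:

$$ \mathbf{D}_0 = \varepsilon\mathbf{E}_0, \quad \mathbf{B}_0 = \mu\mathbf{H}_0. \tag{6.226} $$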


We shall suppose that the quantities $\varepsilon$ and $\mu$ do not depend on field amplitudes. In so doing, we ensure a linear relationship between the fields $\mathbf{E}_0$, $\mathbf{H}_0$ and the inductions $\mathbf{D}_0$, $\mathbf{B}_0$ respectively. However, $\varepsilon$ and $\mu$ may, generally speaking, depend not only on field values but also on the frequency $\omega$ and wavelength $\lambda = 2\pi/k$. When $\varepsilon$ and $\mu$ depend only on frequency, we observe a frequency dispersion. But if $\varepsilon$ and $\mu$ depend on the wavelength, the medium is said to possess a spatial dispersion.

If the material equations (6.226) are substituted into Eqs. (6.225), one gets equations involving only the field amplitudes $\mathbf{E}_0$ and $\mathbf{H}_0$:

$$ \begin{aligned} [\mathbf{k}, \mathbf{E}_0] &= \mu\omega\mathbf{H}_0, & [\mathbf{k}, \mathbf{H}_0] &= -\varepsilon\omega\mathbf{E}_0 - i\sigma\mathbf{E}_0, \\ \varepsilon(\mathbf{k}\mathbf{E}_0) &= 0, & \mu(\mathbf{k}\mathbf{H}_0) &= 0. \end{aligned} \tag{6.227} $$

Hereinafter we shall assume that $\varepsilon$ and $\mu$ do not vanish. Then from the last two equations it follows that the fields $\mathbf{E}_0$ and $\mathbf{H}_0$ are perpendicular to the wave vector $\mathbf{k}$. Such waves are referred to as transverse waves. We shall not discuss the longitudinal waves arising when, for example, $\varepsilon = 0$. (Then it is easy to see that $\mathbf{E}_0 \parallel \mathbf{k}$.)
Multiplying vectorially the first equation of (6.227) by the wave vector $\mathbf{k}$, we obtain

$$ [\mathbf{k}[\mathbf{k}\mathbf{E}_0]] = \mu\omega[\mathbf{k}\mathbf{H}_0]. \tag{6.228} $$

Now the value of $[\mathbf{k}\mathbf{H}_0]$ can be taken from the second equation of (6.227), and the double vector product is written out with the help of the well-known formula

$$ [\mathbf{k}[\mathbf{k}\mathbf{E}_0]] = \mathbf{k}(\mathbf{k}\mathbf{E}_0) - k^2\mathbf{E}_0 = -k^2\mathbf{E}_0. $$

In the last equation we took account of the transverse character of the field $\mathbf{E}_0$, i.e. the relation $(\mathbf{k}\mathbf{E}_0) = 0$. As a result of these transformations we obtain

$$ (k^2 - \varepsilon\mu\omega^2 - i\sigma\omega\mu)\,\mathbf{E}_0 = 0. \tag{6.229} $$

Consequently, the amplitude $\mathbf{E}_0$ of a transverse electromagnetic wave in a stationary conducting medium can differ from zero only if the following equation is satisfied:

$$ k^2 - \varepsilon\mu\omega^2 - i\sigma\mu\omega = 0. \tag{6.230} $$

This condition is referred to as a dispersion relation. One can readily see that the same dispersion relation (6.230) should be satisfied for the magnetic vector $\mathbf{H}_0$ in a transverse wave to differ from zero. In order to show this, one should multiply vectorially the second relation of (6.227) by $\mathbf{k}$ and then make use of the first relation.

Let us assume that the wave frequency $\omega$ is a given quantity and the field depends only on the coordinate z and on time. Then we can represent the fields E and H in the form

$$ \mathbf{E} = \mathbf{E}_0e^{ikz - i\omega t}, \quad \mathbf{H} = \mathbf{H}_0e^{ikz - i\omega t}. \tag{6.231} $$

It follows from (6.227) that we can assume the vector $\mathbf{E}_0$ to be directed in the positive direction of the x axis and the vector $\mathbf{H}_0$ in the positive direction of the y axis. Hence the three vectors $\mathbf{k}$, $\mathbf{E}_0$, $\mathbf{H}_0$ form a right-hand triad of vectors.

In the case of a fixed frequency $\omega$ the dispersion equation (6.230) yields the following values for the wave vector k:

$$ k_{1,2} = \pm\omega\sqrt{\varepsilon\mu + i\,\frac{\sigma\mu}{\omega}}. \tag{6.232} $$


In a conducting medium the wave vector k turns out to be a complex quantity. Hereinafter it will be important that the conductivity $\sigma$ is always positive. This can be deduced even from the fact that the Joule heat generated within a unit of volume of a conducting medium per unit of time is equal to $Q = \mathbf{j}_{cond}\mathbf{E} = \sigma E^2 > 0$.

Let us assume the frequency $\omega$ to be a positive quantity. Then in Eq. (6.232) the imaginary part of the radicand is also positive. Assuming that the permittivity $\varepsilon$ and the permeability $\mu$ are also positive, we find that the solution $k_1$ is located in the first quadrant of the complex plane, i.e. the imaginary and real parts of the solution $k_1$ are positive:

$$ k_1 = k_1' + ik_1'' \quad (k_1', k_1'' > 0). \tag{6.233} $$

The second solution $k_2$ differs from the first one only by sign:

$$ k_2 = -k_1 = -k_1' - ik_1'', \tag{6.234} $$

and we can present both solutions by means of the single equation

$$ k = \pm k' \pm ik'', \tag{6.235} $$

where $k'$ and $k''$ are positive quantities. The quantity $\pm k'$ is the real part of the wave vector k while the quantity $\pm k''$ is its imaginary part.
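
The statement that the root $k_1$ of Eq. (6.232) lies in the first quadrant can be checked directly with complex arithmetic. The following sketch is an illustration only; it assumes SI units, and the medium parameters ($\varepsilon = 80\varepsilon_0$, $\mu = \mu_0$, $\sigma$ = 4 S/m) and the 1 MHz frequency are hypothetical sample values, not data from the text.

```python
import cmath

EPS0 = 8.8541878128e-12   # electric constant, F/m
MU0 = 1.25663706212e-6    # magnetic constant, H/m

def wave_vector(omega, eps, mu, sigma):
    """Root k_1 of the dispersion relation (6.230), written as in Eq. (6.232)."""
    return omega * cmath.sqrt(eps * mu + 1j * sigma * mu / omega)

# Hypothetical sample medium and frequency.
eps, mu, sigma = 80.0 * EPS0, MU0, 4.0
omega = 2.0 * cmath.pi * 1.0e6

k1 = wave_vector(omega, eps, mu, sigma)
k2 = -k1   # the second root, Eq. (6.234)

# Residual of Eq. (6.230); it should vanish to rounding accuracy.
residual = abs(k1**2 - eps * mu * omega**2 - 1j * sigma * mu * omega)
```

Because `cmath.sqrt` returns the principal value of the square root, a radicand with positive real and imaginary parts yields a root with both parts positive, which is the solution denoted $k_1$.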

Substituting Eq. (6.235) in Eq. (6.231), we obtain

$$ \mathbf{E} = \mathbf{E}_0e^{\mp k''z}e^{i(\pm k'z - \omega t)}, \quad \mathbf{H} = \mathbf{H}_0e^{\mp k''z}e^{i(\pm k'z - \omega t)}. \tag{6.236} $$

In these formulae one should take either all upper signs in the exponents, or all lower signs (wherever there is a choice between “±” and “∓”). Thus we obtain two solutions for the fields E, H: one proportional to

$$ e^{-k''z}e^{i(k'z - \omega t)}, \tag{6.237} $$

and the other proportional to

$$ e^{k''z}e^{-i(k'z + \omega t)}. \tag{6.238} $$

At first glance it seems that one of them, Eq. (6.237), attenuates exponentially as z increases (the factor $e^{-k''z}$), while the second one, Eq. (6.238), grows exponentially (the factor $e^{k''z}$). In actual fact, both solutions represent damped waves. To make sure that this is so, let us consider, for example, Eq. (6.238). It can be treated as a wave of the type

$$ e^{-i(k'z + \omega t)}, \tag{6.239} $$

whose amplitude is equal to $e^{k''z}$, i.e. grows exponentially with z. It should be borne in mind that such a representation is meaningful only if the imaginary part $k''$ of the wave vector k is small compared to the real part $k'$, i.e. the wave amplitude changes little at distances of the order of the wavelength.

The phase of the wave (6.239) is defined by the expression entering into the exponent:

$$ \varphi = k'z + \omega t. \tag{6.240} $$

This phase is constant at

$$ z = -\frac{\omega}{k'}\,t + \text{const}, \tag{6.241} $$

i.e. constant phase planes move along the z axis at the velocity

$$ v_{ph} = -\omega/k'. \tag{6.242} $$

This velocity $v_{ph}$ is referred to as the phase velocity of a wave and determines the wave propagation direction. The minus sign in Eq. (6.242) means that the wave (6.238) propagates in the negative direction of the z axis. However, if we move in the negative direction of the z axis together with the wave, its amplitude, being proportional to the factor $e^{k''z}$, will attenuate exponentially. It can be seen that the wave (6.237) propagates in the positive direction of the z axis at the same (in magnitude) phase velocity. The amplitude of that wave is proportional to the factor $e^{-k''z}$ and, consequently, also attenuates in the propagation direction.

Thus, in a conducting medium there are two waves of a given 
frequency propagating in opposite directions and possessing phase 
velocities of equal magnitude. Amplitudes of each of these waves 
attenuate exponentially in the propagation direction. 

It follows from the Maxwell equations that the amplitudes $\mathbf{E}_0$ and $\mathbf{H}_0$ are interrelated. This relationship can be expressed, for example, by the first equation of (6.227). Taking this equation into account, we can write down the expressions for the fields (6.231) in the form

$$ \mathbf{E} = \mathbf{E}_0e^{\mp k''z}e^{i(\pm k'z - \omega t)}, \quad \mathbf{H} = \frac{1}{\mu\omega}[\mathbf{k}\mathbf{E}_0]\,e^{\mp k''z}e^{i(\pm k'z - \omega t)}. \tag{6.243} $$

Thus, in order to determine a free plane electromagnetic wave in a conducting medium, one has to specify not only the characteristics of the medium $\varepsilon$, $\mu$, $\sigma$, but also the field polarization, i.e. the direction and amplitude of the electric field $\mathbf{E}_0$.

From the solution of the dispersion equation (6.230) (see Eq. (6.232)) it is seen that the wave vector $\mathbf{k}$ is a complex quantity. Since we consider a one-dimensional case when the field depends only on the coordinate z and time, we can assume the wave vector $\mathbf{k}$ to be directed along the z axis:

$$ \mathbf{k} = \mathbf{k}' + i\mathbf{k}'', \tag{6.244} $$

where both vectors $\mathbf{k}'$ and $\mathbf{k}''$ are directed along the z axis. Then according to the third relation of (6.227) the vector $\mathbf{E}_0$ is perpendicular to the z axis. Let this vector be directed along the x axis:

$$ \mathbf{E}_0 = (E_0, 0, 0). \tag{6.245} $$

Then

$$ \mathbf{H}_0 = (0, H_0, 0), \tag{6.246} $$

$$ H_0 = \frac{k}{\mu\omega}\,E_0, \tag{6.247} $$

i.e. the magnetic field is directed along the y axis, as is seen from Eq. (6.243).

For the sake of simplicity let us assume the quantity $E_0$ to be real. Then $H_0$ is a complex quantity since the expression for $H_0$ (6.247) includes the complex quantity k (6.232).

Let us represent the wave vector k in the form

$$ k = k' + ik'' = |k|e^{i\varphi}, \tag{6.248} $$

where $|k| = \sqrt{(k')^2 + (k'')^2}$, $\tan\varphi = k''/k'$. Then the field equations (6.243) can be rewritten as

$$ E_x = E_0e^{-k''z}e^{i(k'z - \omega t)}, \quad H_y = \frac{1}{\mu\omega}\,|k|\,E_0e^{-k''z}e^{i(k'z - \omega t + \varphi)}. \tag{6.249} $$

Here we have used only the upper signs in Eq. (6.243), to make things simpler.

It is seen from these equations that in a conducting medium the waves of the electric and magnetic fields are displaced in phase by the angle $\varphi = \arctan(k''/k')$. When conductivity is absent, $k'' = 0$ and the phase displacement disappears.

The real physical fields E and H cannot be complex quantities, so that a physical meaning may be ascribed either to the real or to the imaginary parts of Eq. (6.249). Taking, for example, the real parts of these expressions, we get

$$ E_x = E_0e^{-k''z}\cos(k'z - \omega t), \quad H_y = \frac{1}{\mu\omega}\,|k|\,E_0e^{-k''z}\cos(k'z - \omega t + \varphi). \tag{6.250} $$

The imaginary parts of Eq. (6.249) yield equivalent solutions:

$$ E_x = E_0e^{-k''z}\sin(k'z - \omega t), \quad H_y = \frac{1}{\mu\omega}\,|k|\,E_0e^{-k''z}\sin(k'z - \omega t + \varphi). \tag{6.251} $$


In conclusion, let us analyse Eq. (6.232) for the wave vector k in the cases of low and high conductivity of a medium. Let us write down Eq. (6.232), having chosen the plus sign in it for the sake of simplicity:

$$ k = \omega\sqrt{\varepsilon\mu + i\,\frac{\sigma\mu}{\omega}}. \tag{6.252} $$

If the absolute value of the second term of the radicand is much less than that of the first one (low conductivity), the following approximate expression is valid:

$$ k = \omega\sqrt{\varepsilon\mu} + i\,\frac{\sigma}{2}\sqrt{\frac{\mu}{\varepsilon}} = k' + ik''. \tag{6.253} $$

In this case

$$ k' = \omega\sqrt{\varepsilon\mu}, \quad k'' = \frac{\sigma}{2}\sqrt{\frac{\mu}{\varepsilon}}. \tag{6.254} $$

It is seen from these formulae that in the case of low conductivity the wave (6.250) attenuates e times over the distance L which is inversely proportional to conductivity:

$$ L = \frac{1}{k''} = \frac{2}{\sigma}\sqrt{\frac{\varepsilon}{\mu}}. \tag{6.255} $$

If $\varepsilon$, $\mu$ and $\sigma$ do not depend on frequency, the quantity L has the same magnitude for waves of all frequencies.

In the opposite extreme case of high conductivity we may ignore the first term of the radicand of Eq. (6.252) as compared to the second one. This yields

$$ k = (1 + i)\sqrt{\frac{\sigma\mu\omega}{2}}. \tag{6.256} $$

In this case the imaginary and real parts of the wave vector k are equal in magnitude. The distance L over which the wave attenuates e times is equal to

$$ L = \frac{1}{k''} = \sqrt{\frac{2}{\sigma\mu\omega}}. \tag{6.257} $$

This quantity is referred to as a skin layer depth; the term arose due to the fact that a plane electromagnetic wave falling on a highly conducting body (metal) attenuates drastically, so that the field differs from zero only in a thin surface layer, the depth of this layer being of the same order of magnitude as L. If $\sigma$ and $\mu$ do not depend on frequency, the skin layer depth is inversely proportional to the square root of the incident wave frequency.
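
The two limiting expressions for the attenuation length can be compared with the exact value $1/k''$ obtained from Eq. (6.252). The sketch below is illustrative only; it assumes SI units, and the sample media (a poor conductor with $\sigma = 10^{-8}$ S/m and a copper-like metal with $\sigma = 5.8\cdot10^{7}$ S/m, for which $\varepsilon = \varepsilon_0$ is assumed) and the 1 MHz frequency are hypothetical choices.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # electric constant, F/m
MU0 = 1.25663706212e-6    # magnetic constant, H/m

def attenuation_length(omega, eps, mu, sigma):
    """Exact e-folding length 1/k'' following from Eq. (6.252)."""
    k = omega * cmath.sqrt(eps * mu + 1j * sigma * mu / omega)
    return 1.0 / k.imag

def length_low_conductivity(eps, mu, sigma):
    """Low-conductivity limit (2/sigma)*sqrt(eps/mu), sigma << eps*omega."""
    return (2.0 / sigma) * math.sqrt(eps / mu)

def skin_depth(omega, mu, sigma):
    """High-conductivity limit sqrt(2/(sigma*mu*omega)), sigma >> eps*omega."""
    return math.sqrt(2.0 / (sigma * mu * omega))

omega = 2.0 * math.pi * 1.0e6   # hypothetical frequency, 1 MHz

# Poor conductor (hypothetical values).
low_exact = attenuation_length(omega, 4.0 * EPS0, MU0, 1.0e-8)
low_approx = length_low_conductivity(4.0 * EPS0, MU0, 1.0e-8)

# Good conductor (copper-like conductivity; eps = eps0 is an assumption).
high_exact = attenuation_length(omega, EPS0, MU0, 5.8e7)
high_approx = skin_depth(omega, MU0, 5.8e7)
```

In both cases the approximate and exact lengths agree closely because the ratio $\sigma/(\varepsilon\omega)$ is far from unity; for intermediate conductivities only the exact expression should be used.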

In the foregoing reasoning we assumed the electromagnetic wave frequency $\omega$ to be an assigned quantity and determined the wave vector from the dispersion equation (6.230). We could proceed otherwise by assigning a wavelength, or the corresponding real wave vector $k = 2\pi/\lambda$, and by calculating the wave frequency $\omega$ expressed via the wave vector k and the parameters of the medium $\varepsilon$, $\mu$, $\sigma$:

$$ \omega_{1,2} = -\frac{i\sigma}{2\varepsilon} \pm \sqrt{\frac{k^2}{\varepsilon\mu} - \frac{\sigma^2}{4\varepsilon^2}}. \tag{6.258} $$

We shall not analyse this expression at length; note, however, that both solutions $\omega_{1,2}$ always have a negative imaginary part in the case of positive $\varepsilon$, $\mu$ and $\sigma$, implying the wave attenuation with time. Indeed, the time dependence of the field has the form $e^{-i\omega t}$. The quantity $\omega$ is defined by the complex expression (6.258), which can be written as $\omega = \omega' + i\omega''$, where $\omega'$ is the real part of the frequency and $\omega''$ its imaginary part. When $\omega'' < 0$, one can easily see that the factor $e^{-i\omega t}$ attenuates with time as $e^{\omega''t}$. Do not forget that $\omega''$ is a negative quantity while time varies only in the positive direction!
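
The sign of the imaginary part of the frequency can be verified by solving the dispersion relation (6.230) for $\omega$ at a given real k. The sketch below is an illustration under stated assumptions: SI units, and hypothetical sample values ($\varepsilon = 4\varepsilon_0$, $\mu = \mu_0$, $\sigma = 10^{-3}$ S/m, wavelength 1 m); the function name is not from the text.

```python
import cmath

EPS0 = 8.8541878128e-12   # electric constant, F/m
MU0 = 1.25663706212e-6    # magnetic constant, H/m

def complex_frequencies(k, eps, mu, sigma):
    """Both roots of eps*mu*omega^2 + i*sigma*mu*omega - k^2 = 0, cf. Eq. (6.258)."""
    damping = -1j * sigma / (2.0 * eps)
    root = cmath.sqrt(k * k / (eps * mu) - (sigma / (2.0 * eps)) ** 2)
    return damping + root, damping - root

# Hypothetical sample medium and wavelength.
eps, mu, sigma = 4.0 * EPS0, MU0, 1.0e-3
k = 2.0 * cmath.pi / 1.0   # real wave vector for a 1 m wavelength

w1, w2 = complex_frequencies(k, eps, mu, sigma)
decay_time = -1.0 / w1.imag   # time over which the amplitude falls e times
```

As long as the radicand is positive, both roots share the imaginary part $-\sigma/(2\varepsilon)$, so the decay time equals $2\varepsilon/\sigma$ and does not depend on k.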

Let us consider now the equations describing potentials of an 
electromagnetic field in a moving conducting medium. It is known 
that in a stationary uniform isotropic conducting medium the con- 
ductivity current jcona and the electric field E are interrelated: 


Teond =oE. 


When passing to four-dimensional designations, we must assume
the current components to form a four-dimensional vector and the
electric field to be expressed via the elements of a tensor of the
second rank F_{ik} (see Eq. (6.29)). Therefore, in the relativistic
invariant notation the quantity σ must be expressed via the elements
of a certain tensor of the third rank:

$$j_{m,\,\mathrm{cond}} = \sigma_{mkl}F_{kl}. \tag{6.259}$$


One can easily check that the tensor of the third rank σ_{mkl} can be
expressed as follows:

$$\sigma_{mkl} = \frac{\sigma}{2}\,(\delta_{mk}U_{l} - \delta_{ml}U_{k}). \tag{6.260}$$

Substituting this expression in Eq. (6.259), we obtain j_cond = σE
in the frame where a medium is stationary (U₁ = U₂ = U₃ = 0,
U₄ = ic). If a medium moves at the velocity V, we get from Eqs.
(6.259) and (6.260), and taking into account Eq. (6.29a):

$$j_{m,\,\mathrm{cond}} = \sigma F_{mk}U_{k}. \tag{6.261}$$


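In accordance with Eq. (6.81) this relation is equivalent to the
following formulae:

$$\mathbf{j}_{\mathrm{cond}} = \sigma\Gamma\{\mathbf{E} + [\mathbf{V}\mathbf{B}]\}, \qquad \rho_{\mathrm{cond}} = \sigma\Gamma\,\frac{1}{c}\,(\boldsymbol{\beta}\mathbf{E}).$$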
The first of these relations has a straightforward physical
meaning: this is Ohm’s law for a moving conductor. The factor
multiplying σ in the first relation defines the electric field in the
frame where the medium is stationary. The second relation indicates
that if a stationary conductor carrying a current is electrically
neutral, an electric charge appears on this conductor when it moves
at the velocity V (see § 61 for the physical interpretation of this
phenomenon).
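
Ohm's law for a moving conductor can be illustrated by a short numerical sketch. It evaluates j_cond = σΓ{E + [VB]} and ρ_cond = σΓ(VE)/c² for given fields; the helper names (`dot`, `cross`, `conduction_sources`) and the sample values are ours and serve only as an illustration. A convenient check is the relation j_cond·V = c²ρ_cond, which follows from the two formulae because [VB] is perpendicular to V.

```python
import math


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def conduction_sources(E, B, V, sigma, c=299_792_458.0):
    """Conduction current density and charge density in a medium moving at V.

    Implements j = sigma*Gamma*(E + V x B) and rho = sigma*Gamma*(V.E)/c**2.
    """
    gamma = 1.0 / math.sqrt(1.0 - dot(V, V) / c**2)
    VxB = cross(V, B)
    j = tuple(sigma * gamma * (E[i] + VxB[i]) for i in range(3))
    rho = sigma * gamma * dot(V, E) / c**2
    return j, rho


if __name__ == "__main__":
    # Illustrative values in units with c = 1.
    j, rho = conduction_sources((1.0, 2.0, 3.0), (0.5, -1.0, 2.0),
                                (0.3, 0.1, -0.2), sigma=2.0, c=1.0)
    print(j, rho)
```

For V = 0 the function returns j = σE and ρ = 0, i.e. the usual Ohm's law for a conductor at rest.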

In the case of a moving conducting medium Eq. (6.60) should
be written down as follows:

$$\frac{\partial f_{mk}}{\partial x_{k}} = s_{m} + s_{m,\,\mathrm{cond}},$$

where s_{m,cond} = j_{m,cond} is defined by Eqs. (6.259), (6.260) and
(6.261). Taking into account Eq. (6.261), the last equation is
rewritten as

$$\frac{\partial f_{mk}}{\partial x_{k}} - \sigma F_{mk}U_{k} = s_{m}.$$


Let us substitute into this equation the quantity f_{mk} from Eq.
(6.200), in which the tensor ε_{mkln} is defined by Eq. (6.201). Then
the last equation becomes the equation describing the components
of the tensor F_{mk}. If we now express F_{mk} in the obtained equation
through the field potentials according to Eq. (6.199), we shall
obtain the equation for field potentials in a moving conducting
medium:

$$\left\{\frac{\partial^{2}}{\partial x_{k}^{2}} - \kappa\left(U_{k}\frac{\partial}{\partial x_{k}}\right)^{2} - \sigma\mu\left(U_{k}\frac{\partial}{\partial x_{k}}\right)\right\}\Phi_{m} = -\mu\left(\delta_{mk} + \frac{\kappa}{n^{2}}\,U_{m}U_{k}\right)s_{k}. \tag{6.262}$$

Here the potentials Φ_k satisfy the following additional stipulation:

$$\frac{\partial\Phi_{k}}{\partial x_{k}} - \kappa\,U_{k}U_{l}\frac{\partial\Phi_{l}}{\partial x_{k}} - \sigma\mu\,U_{k}\Phi_{k} = 0.$$

This stipulation is generalized from Eq. (6.210) to include the case
of a conducting medium.

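When the sources s_k are absent in a medium, we get the following
system of homogeneous equations from Eq. (6.262):

$$\left\{\frac{\partial^{2}}{\partial x_{k}^{2}} - \kappa\left(U_{k}\frac{\partial}{\partial x_{k}}\right)^{2} - \sigma\mu\left(U_{k}\frac{\partial}{\partial x_{k}}\right)\right\}\Phi_{m} = 0.$$

This system defines propagation of free electromagnetic waves in
a moving medium which, in the frame where it is stationary, has a
dielectric permittivity ε, magnetic permeability μ and conductivity σ.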
If a plane electromagnetic wave of the type (6.216) propagates
in such a medium, the relation between the frequency ω and
wave vector k of that wave takes the form

$$k^{2} - \frac{\omega^{2}}{c^{2}} - \kappa\Gamma^{2}(\omega - \mathbf{kV})^{2} + i\sigma\mu\Gamma(\omega - \mathbf{kV}) = 0. \tag{6.263}$$

This dispersion equation follows from the foregoing differential
equation if one takes into account that the gradient operator
∇ = ∂/∂r is equivalent to multiplication by −ik when applied to
a plane wave of the type (6.216), and the operator of differentiation
with respect to time ∂/∂t is equivalent to multiplication by iω.
When the conductivity σ of a medium turns into zero, Eq. (6.263)
passes into Eq. (6.218).

Having assigned the frequency ω and propagation direction of
an electromagnetic wave, we can define the magnitude of the wave
vector k (and thereby the wavelength λ = 2π/k) from Eq. (6.263).
On the other hand, having assigned the magnitude and direction
of the wave vector k, we can determine the wave frequency ω. If
one of these quantities, k or ω, is assumed to be given, Eq. (6.263)
becomes a quadratic equation with respect to the other quantity.
This quadratic equation has complex coefficients, and therefore
its solutions are also complex. It follows from the form of
the wave (6.216) that if its frequency is complex, the wave is no
longer monochromatic and it either grows or attenuates exponentially
with time. In this case the attenuation (or growth) index is
equal to the imaginary part of the frequency ω. When the imaginary
part ω″ of the frequency ω is positive, the wave attenuates
with time, and when it is negative, the wave grows with time.

When in Eq. (6.263) the frequency ω and propagation direction
of a wave, that is the angle θ between V and k, are assigned, we
obtain a quadratic equation for the absolute value of the wave
vector k. The solution of this equation yields, generally speaking,
complex values of k. In this case two conjugate complex roots of
Eq. (6.263) correspond to the exponential growth or attenuation
of the wave in space. From here on we shall confine ourselves to
the case of low attenuation when the imaginary part of the solution
of Eq. (6.263) for k can be considered small as compared to
the real part. In this particular case the sign of the imaginary part
of k does not define whether the wave grows or attenuates in space.
Indeed, let one of the solutions of Eq. (6.263), with ω and θ
assigned, be equal to k′ + ik″, where k′ is the real and k″ the
imaginary part of k. The wave vector direction is defined by the unit
vector n so that

$$\mathbf{k} = k\mathbf{n} = (k' + ik'')\,\mathbf{n}.$$







Let us direct the z axis of the Cartesian system of coordinates
along the vector n. Then the wave (6.216) is written down as

$$A = A_{0}e^{i(\omega t - \mathbf{kr})} = A_{0}e^{k''z}\,e^{i(\omega t - k'z)}. \tag{6.264}$$


When the attenuation is low (k″ ≪ k′), Eq. (6.264) can be
considered to define a wave possessing the wave vector k′ and
frequency ω, with its amplitude varying according to the exponential
law e^{k″z}. Suppose k″ is a positive quantity. To make any
conclusions concerning the wave behaviour, one needs to know the wave
propagation direction, i.e. the sign of its phase velocity. The phase
velocity of the wave is equal to the ratio ω/k′. Indeed, the plane
of a constant phase of the wave (6.264) is defined by the relation
ωt − k′z = const, whence z = (ω/k′)t − const/k′. It is seen from the
last relation that the plane of a constant phase travels at the
velocity ω/k′. When ω/k′ > 0, the wave (6.264) propagates in the
positive direction of the z axis. Then if k″ > 0, the wave grows,
and if k″ < 0, it attenuates. And when ω/k′ < 0, the wave
propagates in the negative direction of the z axis. Then if k″ > 0,
the wave attenuates in its propagation direction (although its
amplitude does grow in the positive direction of the z axis). Thus,
to determine whether the wave grows or attenuates, it is not
sufficient to know the law according to which the wave amplitude
varies in space; one has to know the wave propagation direction
as well.

There is a simple method making it possible to find out whether
the wave grows or attenuates in the direction of its propagation.
Let us analyse the expression ωk″/k′. When it is positive, the wave
grows in the direction of its propagation; in the opposite case the
wave attenuates. It can be easily seen that the expression ωk″/k′
is the product of the phase velocity of the wave by the decrement
of its attenuation in space.
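
The criterion can be restated as a small sketch. The function names are hypothetical; `amplitude_ratio` follows the amplitude law e^{k″z} of the wave (6.264) over a path taken along the direction of the phase velocity ω/k′, so that the sign test on ωk″/k′ can be compared with the actual change of amplitude.

```python
import math


def grows_along_propagation(omega, k_re, k_im):
    """Sign test on omega*k''/k': True means growth along the propagation."""
    return omega * k_im / k_re > 0


def amplitude_ratio(omega, k_re, k_im, distance):
    """Amplitude change over `distance` travelled along the propagation.

    The amplitude varies as exp(k''*z); the wave travels towards +z when
    omega/k' > 0 and towards -z when omega/k' < 0.
    """
    direction = math.copysign(1.0, omega / k_re)
    return math.exp(k_im * direction * distance)


if __name__ == "__main__":
    for k_re in (2.0, -2.0):
        for k_im in (0.1, -0.1):
            print(k_re, k_im,
                  grows_along_propagation(1.0, k_re, k_im),
                  amplitude_ratio(1.0, k_re, k_im, distance=1.0))
```

In all four sign combinations the test and the amplitude ratio agree: the ratio exceeds unity exactly when ωk″/k′ is positive.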

Let us consider now the solution of the dispersion equation
(6.263). Let the magnitude of the wave vector be equal to k and
its direction form the angle θ with the velocity vector V of a
medium. In this case kV = kV cos θ. The dispersion equation (6.263)
is a quadratic equation with respect to the frequency ω. Solving
it for the case of low conductivity σ and discarding all powers of
σ exceeding the first, we get

$$\omega_{1,2} = (1 + \kappa c^{2}\Gamma^{2})^{-1}\left\{\kappa c^{2}\Gamma^{2}kV\cos\theta \pm ck\sqrt{\Delta} + i\,\frac{c^{2}\sigma\mu\Gamma}{2}\left[1 \mp \frac{\beta\cos\theta}{\sqrt{\Delta}}\right]\right\}, \tag{6.265}$$

where β = V/c, Γ = (1 − β²)^{−1/2}, and Δ = 1 + κc²Γ²(1 − β² cos² θ).
This equation shows that if in the frame of a stationary medium
εμ > ε₀μ₀, i.e. κ = (n² − 1)/c² > 0, the imaginary
part of the frequency ω is always positive for both solutions,
whatever the velocity V of motion of a medium. This means that
for a given wave vector k the wave (6.216) always attenuates with
time. The attenuation decrement is proportional to the conductivity σ.
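
The conclusion about attenuation in time can be checked by solving the quadratic equation for ω directly. The sketch below is an illustration under stated assumptions: units with c = 1, the dispersion equation taken in the form k² − ω² − κΓ²(ω − kV)² + iσμΓ(ω − kV) = 0, and hypothetical names (`dispersion`, `omega_roots_moving`, the parameter `kappa_c2` standing for κc² = n² − 1, `sigma_mu` for the product σμ).

```python
import cmath
import math


def dispersion(omega, k, theta, beta, kappa_c2, sigma_mu):
    """Left-hand side of the assumed dispersion equation (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    shifted = omega - k * beta * math.cos(theta)      # omega - kV
    return (k**2 - omega**2 - kappa_c2 * gamma**2 * shifted**2
            + 1j * sigma_mu * gamma * shifted)


def omega_roots_moving(k, theta, beta, kappa_c2, sigma_mu):
    """Both complex frequencies for a given real wave vector k."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    g2 = gamma**2
    a = k * beta * math.cos(theta)                    # kV
    # The dispersion equation multiplied by -1: qa*w**2 + qb*w + qc = 0.
    qa = 1.0 + kappa_c2 * g2
    qb = -(2.0 * kappa_c2 * g2 * a + 1j * sigma_mu * gamma)
    qc = kappa_c2 * g2 * a * a - k * k + 1j * sigma_mu * gamma * a
    disc = cmath.sqrt(qb * qb - 4.0 * qa * qc)
    return (-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)


if __name__ == "__main__":
    for w in omega_roots_moving(k=1.0, theta=0.7, beta=0.3,
                                kappa_c2=1.25, sigma_mu=1e-3):
        print(w)
```

With the time dependence e^{iωt} a positive imaginary part of ω means attenuation, so both roots printed here are expected to lie in the upper half-plane.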

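Now let us consider the case when the given characteristics of
the wave (6.216) are the frequency ω and the wave propagation
direction defined by the angle θ. Then from the dispersion
equation (6.263) one can determine the magnitude of the wave vector k
corresponding to the given values of ω and θ. The solutions of this
equation have the form k₁,₂ = k′₁,₂ + ik″₁,₂. In the case of small σ
we obtain after minor transformations

$$\begin{aligned}
ck_{1}' &= \omega\,(1 + \kappa c^{2}\Gamma^{2})\,(\kappa c^{2}\beta\Gamma^{2}\cos\theta + \sqrt{\Delta})^{-1},\\
2ck_{1}'' &= -c^{2}\sigma\mu\Gamma\,(1 - \beta^{2}\cos^{2}\theta)\,(1 + \beta\cos\theta\,\sqrt{\Delta})^{-1}\,\Delta^{-1/2};\\
ck_{2}' &= -\omega\,(\kappa c^{2}\beta\Gamma^{2}\cos\theta + \sqrt{\Delta})\,(1 - \kappa c^{2}\beta^{2}\Gamma^{2}\cos^{2}\theta)^{-1},\\
2ck_{2}'' &= c^{2}\sigma\mu\Gamma\,(1 + \beta\cos\theta\,\sqrt{\Delta})\,(1 - \kappa c^{2}\beta^{2}\Gamma^{2}\cos^{2}\theta)^{-1}\,\Delta^{-1/2}.
\end{aligned} \tag{6.266}$$

Here the quantity Δ is always positive due to the assumptions
made earlier. Using Eq. (6.266) one can obtain expressions for
ωk″₁,₂/k′₁,₂. They have the following form:

$$\frac{\omega k_{1}''}{k_{1}'} = -\frac{c^{2}\sigma\mu\Gamma\,(1 - \beta^{2}\cos^{2}\theta)\,(\kappa c^{2}\beta\Gamma^{2}\cos\theta + \sqrt{\Delta})}{2\,(1 + \kappa c^{2}\Gamma^{2})\,(1 + \beta\cos\theta\,\sqrt{\Delta})\,\sqrt{\Delta}}, \qquad
\frac{\omega k_{2}''}{k_{2}'} = -\frac{c^{2}\sigma\mu\Gamma\,(1 + \beta\cos\theta\,\sqrt{\Delta})}{2\,(\kappa c^{2}\beta\Gamma^{2}\cos\theta + \sqrt{\Delta})\,\sqrt{\Delta}}.$$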


It is immediately seen from this that if κ > 0, the products ωk″₁,₂/k′₁,₂
are always negative, whatever the velocity of motion of a medium.
This means that in a moving conducting medium the wave
(6.216) always attenuates in the direction of its propagation.
The only characteristic property of a moving medium is that when
the velocity of motion of a medium satisfies the condition
1 − κc²β²Γ² cos² θ = 0, or β = (1 + κc² cos² θ)^{−1/2}, the real and
imaginary parts of the second solution change sign simultaneously.
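
The same check can be made for an assigned frequency. The sketch below solves the dispersion equation as a quadratic in k and evaluates the product ωk″/k′ for both roots. As before, it is an illustration under assumptions: units with c = 1, the dispersion equation in the form k² − ω² − κΓ²(ω − kV cos θ)² + iσμΓ(ω − kV cos θ) = 0, hypothetical names, and sample parameters chosen so that 1 − κc²β²Γ² cos² θ > 0.

```python
import cmath
import math


def k_roots_moving(omega, theta, beta, kappa_c2, sigma_mu):
    """Both complex wave numbers for a given real frequency (units c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    g2 = gamma**2
    bc = beta * math.cos(theta)
    # Quadratic qa*k**2 + qb*k + qc = 0 obtained by expanding the
    # assumed dispersion equation in powers of k.
    qa = 1.0 - kappa_c2 * g2 * bc * bc
    qb = 2.0 * kappa_c2 * g2 * bc * omega - 1j * sigma_mu * gamma * bc
    qc = -omega**2 * (1.0 + kappa_c2 * g2) + 1j * sigma_mu * gamma * omega
    disc = cmath.sqrt(qb * qb - 4.0 * qa * qc)
    return (-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)


def growth_product(omega, k):
    """The product omega*k''/k' that decides growth or attenuation."""
    return omega * k.imag / k.real


if __name__ == "__main__":
    omega = 1.0
    for k in k_roots_moving(omega, theta=0.7, beta=0.3,
                            kappa_c2=1.25, sigma_mu=1e-3):
        print(k, growth_product(omega, k))
```

For these parameters one root has k′ > 0, k″ < 0 and the other k′ < 0, k″ > 0, so the product is negative in both cases: each wave attenuates along its own propagation direction.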

The potentials obtained by solving Eqs. (6.211) and (6.262) can 
be successfully used to solve other problems (see the bibliography 
at the end of the book), most of which, however, lie outside the 
scope of this book. 


CHAPTER 7


OPTICAL PHENOMENA 
AND THE SPECIAL THEORY 
OF RELATIVITY 


Light, being a special case of electromagnetic waves, is described
by the Maxwell theory. As we have seen in the foregoing
chapter, the Maxwell theory meets all requirements of the theory of
relativity, and therefore must accurately describe the properties of
such a typical relativistic object as light. But even in the theory of
relativity the propagation of light in vacuo holds a special position.
We have already pointed out that the velocity of light in vacuo is
the ultimate feasible velocity of signal transmission and an
unattainable velocity limit for objects possessing a finite rest mass.
Besides, at the basis of the STR lies the statement about the
velocity of light in vacuo being the same in all IFRs.

The Maxwell theory is a macroscopic theory. In this chapter it
is convenient to examine a microscopic approach and to some
extent even quantum mechanical methods. We mean the introduction
of photons here. In some respect, the introduction of quantum
concepts leads to a very descriptive picture. The utilization of the
theory of relativity becomes indispensable when we consider
optical phenomena associated with a relative motion of bodies (the
Doppler effect, aberration).

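§ 7.1. Properties of plane light waves. The Maxwell theory
shows that in a uniform isotropic medium (ε = const, μ = const)
whose conductivity σ is equal to zero, the time-dependent field
vectors E and H (as well as D and B which are proportional to
them) satisfy the wave equations

$$\Box\mathbf{E} = \Delta\mathbf{E} - \frac{1}{v^{2}}\frac{\partial^{2}\mathbf{E}}{\partial t^{2}} = 0, \qquad \Box\mathbf{H} = \Delta\mathbf{H} - \frac{1}{v^{2}}\frac{\partial^{2}\mathbf{H}}{\partial t^{2}} = 0. \tag{7.1}$$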


This signifies that waves can propagate in a uniform non-conducting
medium, their phase velocity v = 1/√(εμ) being defined
exclusively by the properties of the medium. One of the possible
solutions of Eqs. (7.1) yields the plane waves:

$$\mathbf{E} = \mathbf{E}_{0}e^{i(\omega t - \mathbf{kr})}, \qquad \mathbf{H} = \mathbf{H}_{0}e^{i(\omega t - \mathbf{kr})}, \tag{7.2}$$


where ω is the circular frequency. Here the field vectors are
assumed to depend harmonically on time, and the wave vector to
be directed along a normal to the surface of equal phases (the
wave front). It follows from Eqs. (7.1) that the absolute value of
the wave vector k is equal to ω/v provided waves propagate in a
medium and to ω/c when they propagate in vacuo. Since
n = c/v = c√(εμ), one can write in a general case

$$\mathbf{k} = \frac{\omega}{c}\,n\,\mathbf{s}, \tag{7.3}$$

where s is a unit vector oriented along the propagation direction.
In the case of vacuum n = 1. The phase of the wave (7.2) is
equal to ωt − kr and therefore the surface of equal phases is
defined by the equation ωt − kr = const. At a given moment of
time it represents the plane kr = const, with the vector of its
normal directed along k (r is a conventional three-dimensional
radius vector). In the course of time this plane translates in space
parallel to itself in accordance with the equation kr = const + ωt.

The plane waves (7.2) must satisfy not only Eqs. (7.1) but also
the Maxwell equations (6.56) and (6.57) in the absence of charges
(ρ = 0) and currents (j = 0); substituting Eq. (7.2) into the
Maxwell equations we obtain the following results. In a plane
wave propagating in a uniform medium the vectors E, H and k
form a right-handed triad, i.e. they are mutually perpendicular and
the vector cross product of any pair of them, taken in the order
indicated, defines the direction of the third vector.

As to the relationship between the amplitudes, the following
equation is valid: √μ H = √ε E. Consequently, in the case of
vacuum, when B = μ₀H and ε = ε₀, we get E = cB.

The direction of the Poynting vector S coincides with that of the
vector k while its absolute value is equal to the product of the
energy density in the plane wave and the wave propagation
velocity v, i.e. S = wv, or S = wv(k/k), where w is the energy
density in the electromagnetic wave. This result has a clear physical
meaning: the Poynting vector determines an energy flux across a
unit of area oriented normally to the incident wave per unit of
time. But the energy flowing across a unit of area per unit of time
is contained within the cylinder whose directrix is formed by the
contour of that unit of area and generatrix by straight lines
parallel to the wave propagation direction. The cylinder’s height
should be taken equal to v. In this case the quantity v defines the
volume of the cylinder thus formed, and the product vw, the
electromagnetic field energy contained in the cylinder. All this results
in S = vw. Note also that in a plane wave

$$w = \frac{\varepsilon E^{2} + \mu H^{2}}{2} = \varepsilon E^{2},$$

while in vacuo w = ε₀E².
The momentum of a unit of volume (the momentum density) g of
an electromagnetic field in vacuo is equal to S/c². In the case of
a plane wave in vacuo, when S = cw, we get g = (w/c)(k/k),
whence

$$g = w/c. \tag{7.4}$$


The field momentum density in a medium will be examined in § 7
of this chapter. Recalling the invariants I₁ and I₂ of an
electromagnetic field (§ 6.5), we find that in the case of a plane wave in
vacuo both invariants turn to zero. This means that in any frame
the vectors E and H of a plane wave are orthogonal, and the ratio
of their amplitudes is always the same. In the frame K′ a plane
wave must take the form

$$\mathbf{E}' = \mathbf{E}_{0}'\,e^{i(\omega' t' - \mathbf{k}'\mathbf{r}')}. \tag{7.5}$$


The phase of the wave at the world point $\vec{R}$(r, ict) cannot
depend on the choice of a reference frame. Therefore, the phase
ωt − kr must be an invariant of the Lorentz transformation.
Consequently,

$$\omega t - k_{x}x - k_{y}y - k_{z}z = \omega' t' - k_{x}'x' - k_{y}'y' - k_{z}'z'. \tag{7.6}$$


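Substituting the transformation formulae for x′, y′, z′ and t′ from
Eq. (2.37) into the right-hand side of Eq. (7.6), we obtain

$$\omega t - k_{x}x - k_{y}y - k_{z}z = \omega'\Gamma\left(t - \frac{V}{c^{2}}\,x\right) - k_{x}'\Gamma(x - Vt) - k_{y}'y - k_{z}'z.$$

This is an identity with respect to t, x, y, z. Taking into account
that k = ω/c and k_x = (ω/c)s_x, k_y = (ω/c)s_y, k_z = (ω/c)s_z (s is a unit
vector whose direction coincides with that of k), we get

$$\omega = \omega'\Gamma(1 + \beta s_{x}'), \quad \omega s_{x} = \omega'\Gamma(\beta + s_{x}'), \quad \omega s_{y} = \omega' s_{y}', \quad \omega s_{z} = \omega' s_{z}'. \tag{7.7}$$

Here k′ = (ω′/c)s′.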

From these equations one can easily obtain the formulae describ- 
ing the Doppler effect, that is the light wavelength variation when 
emitted by a source moving relative to an observer, and an aber- 
ration of light, that is the change in the direction of a light beam 
on transition from one inertial frame of reference to another. To 
eliminate reiteration, however, we shall derive Eqs. (7.7) in a 
somewhat different fashion and then, in the next section, investigate 
their consequences. 

We shall take the four-dimensional approach from the very
beginning. It has been already pointed out that the phase ωt − kr
must be an invariant of the Lorentz transformation. But this
expression becomes an invariant automatically if it is represented as
a scalar product of 4-vectors (the scalar product invariance is
shown in Appendix I, § 4). For this purpose it is sufficient to
introduce the 4-wave vector $\vec{k}$(k, iω/c) along with the 4-radius
vector $\vec{R}$(r, ict). Then ωt − kr = −$\vec{k}\vec{R}$. The introduction of the 4-wave
vector $\vec{k}$ is convenient mainly because we immediately obtain the
rule for its component transformation on transition from one IFR
to another. A plane light wave propagating in the frame K′ changes
its direction and observed frequency on transition to the frame K.
We shall see that an amplitude of a plane wave also varies. The
transformation equations for quantities characterizing a light wave
in the reference frames K and K′ can be readily obtained if one
takes into consideration that in a plane light wave the conventional
wave vector k, together with i(ω/c), forms the 4-vector $\vec{k}$.

§ 7.2. A 4-wave vector. The Doppler effect. Aberration of light.
Let us consider a plane light wave observed in the reference frame
K′ and described by the 4-vector $\vec{k}'$. The frame K′ is chosen so that
the light beam propagates in it in the plane (x′, y′) at the angle θ′
to the x′ axis. Write out the 4-vector components:

$$k_{1}' = k'\cos\theta' = \frac{\omega'}{c}\cos\theta', \quad k_{2}' = k'\sin\theta' = \frac{\omega'}{c}\sin\theta', \quad k_{3}' = 0, \quad k_{4}' = i\,\frac{\omega'}{c}. \tag{7.8}$$

Now let us find the components of the 4-vector $\vec{k}$ in the frame K.
In accordance with the general equations (4.10a)

$$k_{1} = \Gamma(k_{1}' - i\beta k_{4}'), \quad k_{2} = k_{2}', \quad k_{3} = k_{3}', \quad k_{4} = \Gamma(k_{4}' + i\beta k_{1}'). \tag{7.9}$$
Since k₃ = 0, in the frame K the beam propagates in the plane
(x, y) as well. Consequently, the 4-vector $\vec{k}$ has the components
$\vec{k}$((ω/c) cos θ, (ω/c) sin θ, 0, iω/c) in the frame K. From the last
formula (7.9) we find that

$$i\,\frac{\omega}{c} = \Gamma\left(i\,\frac{\omega'}{c} + i\beta\,\frac{\omega'}{c}\cos\theta'\right),$$

or

$$\omega = \omega'\Gamma(1 + \beta\cos\theta'). \tag{7.10}$$

Consequently, if in the frame K′ the light frequency is equal to ω′,
it will be different in the frame K in accordance with Eq. (7.10)
(cf. Eqs. (7.7)). It follows from the first formula (7.9) that

$$\frac{\omega}{c}\cos\theta = \Gamma\left(\frac{\omega'}{c}\cos\theta' - i\beta\cdot i\,\frac{\omega'}{c}\right),$$

or, if Eq. (7.10) is taken into account,

$$\cos\theta = \frac{\omega'}{\omega}\,\Gamma(\cos\theta' + \beta) = \frac{\cos\theta' + \beta}{1 + \beta\cos\theta'}. \tag{7.11}$$

The second formula (7.9), together with Eq. (7.10), yields

$$\sin\theta = \frac{\omega'}{\omega}\sin\theta' = \frac{\sin\theta'}{\Gamma(1 + \beta\cos\theta')} = \frac{\sqrt{1 - \beta^{2}}\,\sin\theta'}{1 + \beta\cos\theta'}. \tag{7.12}$$


Using Eqs. (7.11) and (7.12), one can easily find the expression
for sin θ′ in terms of the angle θ:

$$\sin\theta' = \frac{\sin\theta}{\Gamma(1 - \beta\cos\theta)}. \tag{7.12'}$$

Note that Eq. (7.12′) is immediately obtained from Eq. (7.12) by
substituting unprimed quantities for primed ones and vice versa
and by taking the velocity V with the opposite sign. The equations
obtained make it possible to interpret quantitatively the two optical
effects: the Doppler effect and aberration of light. The Doppler
effect which is observed for waves of any kind consists in the fact
that in the case of a relative motion of a source and an observer
(receiver) the frequency (of sound or light) determined by the
observer differs from that measured in the reference frame in which
the source is at rest.

Let a source be at rest in the frame K′. Then the instruments
resting in that frame will determine the natural frequency ω₀ of
the light source (ω₀ = ω′).

When determining the frequency ω in the frame K, we need to
convert the angle θ′ into the angle θ. It follows from Eq. (7.11)
that

$$\cos\theta' = \frac{\cos\theta - \beta}{1 - \beta\cos\theta},$$

whence 1 + β cos θ′ = (1 − β²)/(1 − β cos θ), and, consequently,
Eq. (7.10) can be rewritten in the final form:

$$\omega = \omega_{0}\,\frac{\sqrt{1 - \beta^{2}}}{1 - \beta\cos\theta}. \tag{7.13}$$

This is the equation describing the Doppler effect. An observer in
the frame K will observe the radiation frequency ω differing from
the natural frequency ω₀ of the source. The observed frequency
depends not only on the relative velocity of the source and the
observer (β = V/c), but also on the angle θ at which light comes
to the observer.
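
The chain of formulae leading to the Doppler equation can be verified numerically. The sketch below (function names are ours) first transforms the frequency and the angle from K′ to K, then checks that the frequency found in this way coincides with ω = ω₀√(1 − β²)/(1 − β cos θ) written in terms of the angle θ measured in K.

```python
import math


def transform_wave(omega_prime, theta_prime, beta):
    """Frequency and propagation angle in K from those in K'.

    Uses omega = omega'*Gamma*(1 + beta*cos(theta')) together with the
    expressions for cos(theta) and sin(theta) in terms of theta'.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 + beta * math.cos(theta_prime)
    omega = omega_prime * gamma * denom
    cos_t = (math.cos(theta_prime) + beta) / denom
    sin_t = math.sin(theta_prime) / (gamma * denom)
    return omega, math.atan2(sin_t, cos_t)


def doppler(omega0, theta, beta):
    """Observed frequency for natural frequency omega0 and angle theta in K."""
    return omega0 * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))


if __name__ == "__main__":
    beta = 0.6
    for theta_prime in (0.0, 0.5, math.pi / 2, 2.5, math.pi):
        omega, theta = transform_wave(1.0, theta_prime, beta)
        print(theta_prime, theta, omega, doppler(1.0, theta, beta))
```

The last two columns of the output should coincide for every angle, which is just the content of the passage from the formula with θ′ to the one with θ.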

In particular, if the radiation comes along the relative velocity
direction, we observe the so-called radial Doppler effect. If the
frame K′ is to the right of K, the source moves away from the
observer and light propagates in the direction opposite to the x axis
(see Fig. 7.1a). Consequently, cos θ = cos π = −1. Then from
Eq. (7.13) we obtain the frequency ω and period T = 2π/ω:

$$\omega = \omega_{0}\sqrt{\frac{1 - \beta}{1 + \beta}}, \qquad T = T_{0}\sqrt{\frac{1 + \beta}{1 - \beta}}.$$

An observer receiving light from a source moving away from
him finds the frequency to decrease.


Fig. 7.1. The radial Doppler effect: (a) an observer and a source move away
from each other (θ = π); (b) an observer and a source draw together (θ = 0);
(c) the transverse Doppler effect (θ = π/2).


On the other hand, when the frame K′ is to the left of K (see
Fig. 7.1b), cos θ = 1 and a source approaches an observer:

$$\omega = \omega_{0}\sqrt{\frac{1 + \beta}{1 - \beta}}, \qquad T = T_{0}\sqrt{\frac{1 - \beta}{1 + \beta}}.$$

The frequency of light received by an observer increases as
compared to the natural frequency ω₀. To an accuracy of β² terms the
last two formulae can be rewritten as follows (the easiest way is
to multiply both the radicand numerator and denominator by the
numerator):

$$\omega = \omega_{0}(1 - \beta), \qquad \omega = \omega_{0}(1 + \beta).$$

Both formulae can be combined:

$$\frac{\omega - \omega_{0}}{\omega_{0}} = \frac{\Delta\omega}{\omega_{0}} = \pm\beta.$$
Thus, the radial Doppler effect proves to be an effect of the first
order with respect to β. To an accuracy of the second order with
respect to β the formulae obtained coincide with the classical ones
following from the fundamental considerations (§ 3.3).

When light is observed in the direction which is perpendicular
to the source velocity direction, i.e. θ = π/2 (see Fig. 7.1c), we
witness the so-called transverse Doppler effect. It is described by
the formula

$$\omega = \omega_{0}\sqrt{1 - \beta^{2}}$$

and depends on β² this time. If a source motion velocity is
non-relativistic, the binomial expansion yields

$$\omega \approx \omega_{0}(1 - \beta^{2}/2).$$

Since this is a second-order effect, its observation is much more
difficult to perform as compared to the radial effect. No wonder
the transverse Doppler effect was observed as late as 1938 (Ives),
when the relativistic formula was wholly confirmed *. We would
like to point out here once more that according to the classical
theory no transverse Doppler effect should exist (cf. § 3.3). The
transverse Doppler effect arises only due to the relativity of time
intervals between events.
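
The orders of magnitude mentioned here are easy to see numerically. The sketch below (the function name `doppler` and the value β = 0.01 are illustrative) evaluates the general formula ω = ω₀√(1 − β²)/(1 − β cos θ) for θ = π, 0 and π/2 and compares the results with the approximate expressions 1 ∓ β and 1 − β²/2.

```python
import math


def doppler(omega0, theta, beta):
    """Observed frequency for natural frequency omega0 and angle theta in K."""
    return omega0 * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))


if __name__ == "__main__":
    beta = 0.01
    receding = doppler(1.0, math.pi, beta)        # source moves away
    approaching = doppler(1.0, 0.0, beta)         # source approaches
    transverse = doppler(1.0, math.pi / 2, beta)  # light observed at a right angle
    print("radial, receding:   ", receding, "first order:", 1.0 - beta)
    print("radial, approaching:", approaching, "first order:", 1.0 + beta)
    print("transverse:         ", transverse, "second order:", 1.0 - beta**2 / 2)
```

The radial shifts differ from unity already in the first order in β, while the transverse shift appears only in the second order, which is why it is so much harder to observe.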

Let us rewrite Eq. (7.13) in the form that was used in § 6.15.
We shall group the quantities pertaining to the frame K on the
right-hand side:

$$\omega_{0} = \Gamma\omega\left(1 - \frac{V}{c}\cos\theta\right) = \Gamma\left(\omega - \frac{\omega}{c}\,V\cos\theta\right) = \Gamma(\omega - \mathbf{kV}). \tag{7.14}$$

The natural frequency appears on the left-hand side while the
right-hand side contains the frequency observed in the reference
frame moving at the velocity V, the light propagation direction
being defined by the vector k.

Eqs. (7.11) and (7.12) coincide with the formulae derived di- 
rectly from the velocity transformation formulae; therefore they 
fully describe the phenomenon of aberration that was mentioned 
in § 3.6. 

In particular, the expression for an aberration angle follows from
Eqs. (7.11) and (7.12):

$$\tan\theta = \frac{\sin\theta'\,\sqrt{1 - \beta^{2}}}{\beta + \cos\theta'}. \tag{7.15}$$
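
The aberration formula can be illustrated by a short sketch (the function name is ours). It computes θ from tan θ = sin θ′ √(1 − β²)/(β + cos θ′), using `atan2` so that the angle falls into the correct quadrant, and compares cos θ with (cos θ′ + β)/(1 + β cos θ′).

```python
import math


def aberration_angle(theta_prime, beta):
    """Angle theta in K for a beam travelling at theta' in K' (0 <= theta' <= pi)."""
    return math.atan2(math.sin(theta_prime) * math.sqrt(1.0 - beta**2),
                      beta + math.cos(theta_prime))


if __name__ == "__main__":
    beta = 0.5
    for theta_prime in (0.0, math.pi / 4, math.pi / 2, 2.0, math.pi):
        theta = aberration_angle(theta_prime, beta)
        cos_expected = (math.cos(theta_prime) + beta) / (1.0 + beta * math.cos(theta_prime))
        print(theta_prime, theta, math.cos(theta), cos_expected)
```

For instance, a beam emitted at right angles to the x′ axis in K′ (θ′ = π/2) is observed in K at the angle for which cos θ = β: the beam is tilted towards the direction of motion.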


To conclude this section, let us derive the transformation for- 
mula for a solid angle element written in spherical coordinates. 


* The details of Ives's experiments can be found in the monograph by
Landsberg G. S., Optics, 1976, “Nauka” Publishing House (in Russian).
We shall orient the polar axis in the direction of relative motion of
two frames (the x, x′ axis). In the frame K′ the solid angle
element dΩ′ is written down as dΩ′ = sin θ′ dθ′ dφ′ = −d(cos θ′) dφ′.
Since the y and z coordinates do not change, the φ coordinate, that
is the projection on a plane perpendicular to the motion direction,
does not change either: φ = φ′ and dφ = dφ′. From the formula
preceding Eq. (7.13) it follows that

$$d(\cos\theta') = -\frac{1 - \beta^{2}}{(1 - \beta\cos\theta)^{2}}\,\sin\theta\,d\theta,$$

whence the sought for transformation formula is obtained:

$$d\Omega' = \frac{1 - \beta^{2}}{(1 - \beta\cos\theta)^{2}}\,\sin\theta\,d\theta\,d\varphi = \frac{d\Omega}{\Gamma^{2}(1 - \beta\cos\theta)^{2}}, \tag{7.16}$$

since in the frame K the solid angle element dΩ is equal to
dΩ = sin θ dθ dφ.
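
A simple consistency check of the solid angle transformation: all directions in K correspond to all directions in K′, so the factor dΩ′/dΩ integrated over the full sphere in K must give 4π. The sketch below performs this integration by the midpoint rule; the function names and the number of steps are illustrative.

```python
import math


def solid_angle_factor(theta, beta):
    """Ratio dOmega'/dOmega = (1 - beta**2) / (1 - beta*cos(theta))**2."""
    return (1.0 - beta**2) / (1.0 - beta * math.cos(theta))**2


def total_solid_angle_in_k_prime(beta, n=20000):
    """Integrate dOmega' over all directions of K (midpoint rule in cos(theta))."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h                  # u = cos(theta)
        total += (1.0 - beta**2) / (1.0 - beta * u)**2 * h
    return 2.0 * math.pi * total                  # integration over phi


if __name__ == "__main__":
    print(total_solid_angle_in_k_prime(0.6), 4.0 * math.pi)
```

The factor itself is strongly anisotropic: for β = 0.6 it equals 4 in the forward direction (θ = 0) and 0.25 in the backward direction (θ = π), yet the integral remains equal to 4π.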


§ 7.3. A plane wave limited in space. The transformation of the
plane wave energy and amplitude. Let us calculate the components
of an energy-momentum-tension tensor for the case of a plane
wave. We shall orient the x′ axis along the wave propagation
direction, the y′ axis along the vector E′ and the z′ axis along the
vector B′. With such a choice of axes, E′_x = D′_x = B′_x = H′_x =
= E′_z = D′_z = H′_y = B′_y = 0. The tensor T′_{ik} takes the simple
form (see Eqs. (6.128), (6.148) and (6.151)):

$$T'_{ik} = \begin{pmatrix} -w' & 0 & 0 & -iw' \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -iw' & 0 & 0 & w' \end{pmatrix}. \tag{7.17}$$

We shall also need the components of the tensor T′_{ik} in the case
when a plane wave propagates in the (x′, y′) plane at the angle θ′
to the x′ axis. Such a transition is accomplished through a simple
rotation of a coordinate system; the matrix of this coordinate
transformation takes the form

$$\begin{pmatrix} \cos\theta' & -\sin\theta' & 0 & 0 \\ \sin\theta' & \cos\theta' & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{7.18}$$

Transforming the components of the tensor (7.17) through the use
of the matrix (7.18) according to the general rules of tensor
transformation, we reach the tensor T′_{ik}:

$$T'_{ik} = \begin{pmatrix} -w'\cos^{2}\theta' & -w'\sin\theta'\cos\theta' & 0 & -iw'\cos\theta' \\ -w'\sin\theta'\cos\theta' & -w'\sin^{2}\theta' & 0 & -iw'\sin\theta' \\ 0 & 0 & 0 & 0 \\ -iw'\cos\theta' & -iw'\sin\theta' & 0 & w' \end{pmatrix}. \tag{7.19}$$

Therefore, specifically,

$$T'_{44} = w', \qquad S'_{x} = cw'\cos\theta', \qquad T'_{11} = -w'\cos^{2}\theta'. \tag{7.20}$$

Let us prove the following theorem: a plane wave limited in 
space along its propagation direction (such a wave is some- 
times called “a train of waves”) possesses a momentum and 
an energy making up a 4-vector similar to a 4-vector of energy- 
momentum of a material particle. (This theorem is a particular 
case of the more general theorem *.) To prove the theorem we have 
to know the formula defining a change in volume occupied by a 
train of waves on transition from one inertial frame to another. 
The difficulty arising here is caused by the fact that the train of 
waves moves at the velocity of light c so that the volume of the 
train cannot be measured in the proper frame of reference. It is 
impossible to introduce the reference frame moving at the velocity 
of light! However, one can bypass the introduction of the proper 
volume, having finally accomplished the limit transition to the ve- 
locity of light. 

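Let a certain volume move as a whole in the frame K′ at the
velocity v′, with its value being equal to 𝒱₀ in the proper frame of
reference. Then according to Eq. (3.28)

$$\mathcal{V}' = \mathcal{V}_{0}\sqrt{1 - \frac{v'^{2}}{c^{2}}}. \tag{7.21}$$

If one considers this volume in the frame K, its velocity v will be
determined by Eq. (3.41), and, consequently, the magnitude 𝒱 of
this volume in the frame K will be equal to

$$\mathcal{V} = \mathcal{V}_{0}\sqrt{1 - \frac{v^{2}}{c^{2}}} = \mathcal{V}_{0}\,\frac{\sqrt{1 - v'^{2}/c^{2}}\,\sqrt{1 - \beta^{2}}}{1 + \beta\,(v'/c)\cos\theta'} = \mathcal{V}'\,\frac{\sqrt{1 - \beta^{2}}}{1 + \beta\,(v'/c)\cos\theta'};$$
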



* The general theorem is presented in [13], § 57, and in W. Heitler's The
Quantum Theory of Radiation, Oxford, 1954, § 2. This general theorem can be
formulated as follows: in the space region where the tensor T_{ik} satisfies the
condition ∂T_{ik}/∂x_k = 0 and T_{ik} = 0 at the boundaries, the volume integrals
of the components T_{4k} constitute the components of a 4-vector.
the second equality follows from Eq. (3.41). Here v′ is the velocity
of motion in the frame K′. Now, passing to the limit v′ → c, we get
the required formula:

$$\mathcal{V} = \mathcal{V}'\,\frac{\sqrt{1 - \beta^{2}}}{1 + \beta\cos\theta'}. \tag{7.22}$$

Thus, if a certain volume in the frame K′ is equal to 𝒱′, we shall
observe in the frame K, moving at the velocity V relative to K′,
a volume whose magnitude is determined from Eq. (7.22). It is
understood that there is a similar relation for volume differentials:

$$d\mathcal{V} = d\mathcal{V}'\,\frac{\sqrt{1 - \beta^{2}}}{1 + \beta\cos\theta'} = \frac{1}{\Gamma(1 + \beta\cos\theta')}\,d\mathcal{V}'. \tag{7.23}$$
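
The limit transition used above can be followed numerically. The sketch below works in units with c = 1 and assumes the standard velocity addition formulae for a frame K′ moving along the x axis (our reading of Eq. (3.41), which is not reproduced here); the function names are illustrative. It evaluates the ratio of the volume in K to that in K′ for a body moving in K′ at a velocity v′ close to that of light and compares it with the limiting expression √(1 − β²)/(1 + β cos θ′).

```python
import math


def volume_ratio(v_prime, theta_prime, beta):
    """V/V' for a body moving in K' at speed v_prime (units with c = 1).

    The velocity in K is found from the usual addition formulae (an
    assumption standing in for Eq. (3.41)); both volumes are the proper
    volume times the corresponding contraction factor.
    """
    vx_p = v_prime * math.cos(theta_prime)
    vy_p = v_prime * math.sin(theta_prime)
    denom = 1.0 + beta * vx_p
    vx = (vx_p + beta) / denom
    vy = vy_p * math.sqrt(1.0 - beta**2) / denom
    return math.sqrt(1.0 - vx**2 - vy**2) / math.sqrt(1.0 - v_prime**2)


def volume_ratio_light(theta_prime, beta):
    """Limiting value of V/V' for v' -> c."""
    return math.sqrt(1.0 - beta**2) / (1.0 + beta * math.cos(theta_prime))


if __name__ == "__main__":
    beta, theta_prime = 0.6, 1.0
    for v_prime in (0.9, 0.99, 0.999999):
        print(v_prime, volume_ratio(v_prime, theta_prime, beta),
              volume_ratio_light(theta_prime, beta))
```

The ratio remains finite as v′ approaches c, although each of the two contraction factors separately tends to zero; this is what allows the proper volume to be bypassed.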
Let us go back to proving the theorem. Applying the general
tensor transformation formulae (A.I.31) to the tensor (7.17), we
shall obtain the fourth line components of the matrix T_{ik} in the
following form:

$$T_{41} = -i\Gamma^{2}w'(1 + \beta)^{2}, \quad T_{42} = 0, \quad T_{43} = 0, \quad T_{44} = \Gamma^{2}w'(1 + \beta)^{2}, \tag{7.24}$$

and for the tensor (7.19) the components

$$\begin{aligned}
T_{41} &= -i\Gamma^{2}w'(\beta + \cos\theta')(1 + \beta\cos\theta'), & T_{42} &= -i\Gamma w'\sin\theta'\,(1 + \beta\cos\theta'),\\
T_{43} &= 0, & T_{44} &= \Gamma^{2}w'(1 + \beta\cos\theta')^{2}.
\end{aligned} \tag{7.25}$$

Naturally, when θ′ = 0, Eqs. (7.25) pass into Eqs. (7.24). Let us
show now that the components (7.25) and, consequently, in the
specific case, (7.24) as well, transform as the components of a
vector on transition to another frame of reference when multiplied
by the volume or the volume element. Indeed, for example,

$$\begin{aligned}
T_{41}\,d\mathcal{V} &= \Gamma(-iw'\cos\theta' - i\beta w')\,d\mathcal{V}' = \Gamma(T'_{41}\,d\mathcal{V}' - i\beta T'_{44}\,d\mathcal{V}'),\\
T_{42}\,d\mathcal{V} &= T'_{42}\,d\mathcal{V}', \qquad T_{43}\,d\mathcal{V} = T'_{43}\,d\mathcal{V}' = 0,\\
T_{44}\,d\mathcal{V} &= \Gamma(w' + \beta w'\cos\theta')\,d\mathcal{V}' = \Gamma(T'_{44}\,d\mathcal{V}' + i\beta T'_{41}\,d\mathcal{V}').
\end{aligned} \tag{7.26}$$

Comparing the obtained formulae (7.26) with the vector
transformation formulae, we conclude that the quantities T₄₁, T₄₂, T₄₃,
T₄₄, i.e. the fourth line components of the tensor (7.17) or (7.19),
multiplied by the corresponding volume element make up a 4-vector.

Of course, this result holds after integrating over volume or
multiplying by a total volume, provided the tensor components T_{4k}
do not depend on coordinates, as is the case in a plane wave.

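
The relations (7.26) can also be verified by direct numerical transformation of the tensor (7.19). In the sketch below the Lorentz matrix is written for the imaginary fourth coordinate x₄ = ict, in the form A₁ = Γ(A′₁ − iβA′₄), A₄ = Γ(A′₄ + iβA′₁); all function names are ours and the numerical values are arbitrary.

```python
import math


def boost_matrix(beta):
    """Lorentz matrix from K' to K for coordinates (x, y, z, ict)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, -1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [1j * beta * g, 0, 0, g]]


def plane_wave_tensor(w, theta):
    """Tensor (7.19) of a plane wave travelling at angle theta to the x' axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-w * c * c, -w * s * c, 0, -1j * w * c],
            [-w * s * c, -w * s * s, 0, -1j * w * s],
            [0, 0, 0, 0],
            [-1j * w * c, -1j * w * s, 0, w]]


def transform_tensor(a, t):
    """T_ik = a_il * a_km * T'_lm (summation over l and m)."""
    return [[sum(a[i][l] * a[k][m] * t[l][m]
                 for l in range(4) for m in range(4))
             for k in range(4)] for i in range(4)]


def fourth_row_times_volume(beta, w, theta, dv_prime=1.0):
    """Return (T_4k dV computed in K, vector transform of T'_4k dV')."""
    a = boost_matrix(beta)
    t_prime = plane_wave_tensor(w, theta)
    t = transform_tensor(a, t_prime)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    dv = dv_prime / (gamma * (1.0 + beta * math.cos(theta)))   # Eq. (7.23)
    lhs = [t[3][k] * dv for k in range(4)]
    rhs = [sum(a[k][m] * t_prime[3][m] * dv_prime for m in range(4))
           for k in range(4)]
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = fourth_row_times_volume(beta=0.6, w=2.0, theta=0.8)
    for x, y in zip(lhs, rhs):
        print(x, y)
```

The two columns of the output should coincide component by component: the fourth row of the tensor, multiplied by the volume element, transforms as a 4-vector, whereas without the volume factor it does not.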


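Let us find the total energy of a train of waves in the frame K′
(see Eq. (7.19)):

$$U' = \int T'_{44}\,d\mathcal{V}' = \int w'\,d\mathcal{V}'. \tag{7.27}$$

The total momentum components of a train of waves are defined
by the following formulae:

$$G'_{x} = \frac{i}{c}\int T'_{41}\,d\mathcal{V}' = \frac{i}{c}\int(-iw'\cos\theta')\,d\mathcal{V}' = \frac{U'}{c}\cos\theta', \qquad
G'_{y} = \frac{i}{c}\int T'_{42}\,d\mathcal{V}' = \frac{U'}{c}\sin\theta', \qquad G'_{z} = 0. \tag{7.28}$$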
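Similar calculations can also be accomplished in the frame K.
Eqs. (7.25) allow the total energy transformation to be performed
directly:

$$U = \int T_{44}\,d\mathcal{V} = \Gamma^{2}(1 + \beta\cos\theta')^{2}\int\frac{w'\,d\mathcal{V}'}{\Gamma(1 + \beta\cos\theta')} = \Gamma(1 + \beta\cos\theta')\,U'. \tag{7.29}$$

Calculating also the total momentum components

$$G_{x} = \frac{i}{c}\int T_{41}\,d\mathcal{V} = \frac{i}{c}\,(-i)\,\Gamma(\beta + \cos\theta')\int w'\,d\mathcal{V}' = \frac{1}{c}\,\Gamma(\beta + \cos\theta')\,U' = \frac{U}{c}\cos\theta, \tag{7.30}$$

we used Eq. (7.11) in the last equality; in much the same way,
using Eq. (7.12), we get

$$G_{y} = \frac{i}{c}\int T_{42}\,d\mathcal{V} = \frac{1}{c}\,U'\sin\theta' = \frac{1}{c}\,\Gamma(1 + \beta\cos\theta')\,U'\sin\theta = \frac{U}{c}\sin\theta.$$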

Thus, in any inertial frame of reference we can introduce the
4-vector

$$\vec{P}'\left(\frac{U'}{c}\cos\theta',\ \frac{U'}{c}\sin\theta',\ 0,\ i\,\frac{U'}{c}\right), \tag{7.31}$$

with $\vec{P}'^{2} = 0$ in all reference frames.


It follows from the condition $\vec{P}^{2} = 0$ that the light wave in vacuo
cannot be at rest in any inertial frame of reference. Comparing
the components $\vec{P}'$ (7.31) to the components $\vec{k}'$ (7.8), we see that
the transformation formulae for U′ and ω′ must be the same. This
implies that the ratio U′/ω′ must be invariant. Consequently, the
energy of the same train of waves turns out to be different when
measured by different observers. The ratio of energies is equal to
the ratio of frequencies of the monochromatic radiation which
forms the train of waves. The frequencies are determined by the
same observers who measure the energy. The train is supposed to
be long enough since otherwise it will not be even approximately
monochromatic.

From the formulae obtained the amplitude transformation law
is easily established in the case of a plane wave. Indeed, from Eqs.
(7.25) we get the following formula for energy density transformation:

$$w = w'\,\Gamma^2 (1 + \beta\cos\theta')^2.$$

Comparing this expression to the frequency transformation formula (7.10)

$$\omega = \omega'\,\Gamma (1 + \beta\cos\theta'),$$

we see that the energy density transforms as the square of frequency. Since the energy density is a quadratic function of the field
amplitudes of a plane wave, the amplitudes transform according to the same
rule as frequency does.
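
The relations (7.29) to (7.31) can be checked numerically. The following Python sketch is only an illustration: the function names, the choice of units with c = 1 and the sample values are assumptions of the example, not part of the text. It boosts the energy-momentum 4-vector of a wave train from K' to K with the ordinary Lorentz transformation (K' moving along the x axis at the velocity βc) and compares the result with $U = \Gamma(1 + \beta\cos\theta')U'$.

```python
import math

def boost_train(U_p, theta_p, beta, c=1.0):
    """Lorentz-transform the energy and momentum of a wave train from K' to K.

    K' moves along the x axis of K with velocity beta*c.
    Returns (U, G_x, G_y) as measured in K.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    gx_p = U_p / c * math.cos(theta_p)      # G'_x = (U'/c) cos(theta')
    gy_p = U_p / c * math.sin(theta_p)      # G'_y = (U'/c) sin(theta')
    U = gamma * (U_p + beta * c * gx_p)     # energy in K
    gx = gamma * (gx_p + beta * U_p / c)    # longitudinal momentum in K
    return U, gx, gy_p                      # transverse momentum is unchanged

def doppler_factor(theta_p, beta):
    """omega/omega' = Gamma (1 + beta cos(theta')), cf. Eq. (7.10)."""
    return (1.0 + beta * math.cos(theta_p)) / math.sqrt(1.0 - beta**2)

def energy_density_ratio(theta_p, beta):
    """w/w' equals the square of the frequency ratio."""
    return doppler_factor(theta_p, beta) ** 2
```

With these helpers the ratio U/U' returned by `boost_train` can be compared with `doppler_factor` for any angle, and $G_x^2 + G_y^2 - U^2/c^2$ should vanish to rounding error, in line with $\vec{P}^2 = 0$.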

To illustrate the treatment of an electromagnetic wave as a system whose momentum and energy form a 4-vector, let us consider
how the angular distribution of radiation from a dipole oscillator
is transformed on transition from the frame K', in which the oscillator's centre of inertia is at rest, to any other IFR. In the reference
frame in which the oscillator's centre of inertia is at rest and the
polar axis is directed along the oscillator's axis, the radiation intensity $dI'$ in the direction $(\theta', \varphi')$ is known to be equal to
$dI'(\theta', \varphi') = \mathrm{const}\cdot\sin^2\theta'\,d\Omega'$. But the radiation intensity $dI = d\varepsilon/dt$, i.e. the energy radiated in a given direction per unit of
time, is a relative quantity. Its transformation law is easy to
establish; in the case of radiation $d\varepsilon = c\,dP$, where $dP$ is a momentum fraction escaping with the radiation in a given direction,
it being known that $d\varepsilon' = c\,dP'$. According to the Lorentz transformation

$$d\varepsilon' = \Gamma(d\varepsilon - \mathbf{V}\,d\mathbf{P}) = \Gamma(d\varepsilon - V\,dP\cos\theta) = \Gamma\,d\varepsilon\,(1 - \beta\cos\theta),$$
$$dt' = \frac{dt}{\Gamma},$$

where $dt'$ is the proper time. Having divided termwise the upper
equality by the lower one, we get

$$dI = \frac{d\varepsilon}{dt} = \frac{d\varepsilon'}{dt'}\,\frac{1}{\Gamma^2(1 - \beta\cos\theta)} = \frac{dI'}{\Gamma^2(1 - \beta\cos\theta)}.$$


Then we immediately obtain the sought-for result:

$$dI = \mathrm{const}\cdot\frac{\sin^2\theta'\,d\Omega'}{\Gamma^2(1 - \beta\cos\theta)} = \mathrm{const}\cdot\frac{\sin^2\theta'\,d\Omega}{\Gamma^4(1 - \beta\cos\theta)^3} = \mathrm{const}\cdot\frac{\left(1 - \frac{V^2}{c^2}\right)^3\sin^2\theta}{\left(1 - \frac{V}{c}\cos\theta\right)^5}\,d\Omega,$$

where we have used Eqs. (7.16) and (7.12'). It is seen from the
obtained formula that the angular dependence of radiation in the
frame K, relative to which the oscillator moves, differs essentially
from the angular dependence in the frame K', especially in the case
when V ≈ c. In this case the maximum radiation is observed in a
direction forming an acute angle with the oscillator's axis
(Fig. 7.2).

Fig. 7.2. The variation in the angular distribution of radiation emitted by a
dipole oscillator on transition from the reference frame K' in which the oscillator is at rest (β = 0) to the reference frame K relative to which it
moves (β = 1/2). The maximum radiation direction is seen to be inclined toward the oscillator's motion direction. The axis of the oscillator is oriented
in the oscillator's motion direction.

It is worthwhile to consider how the radiation which is isotropic in the reference frame K'
behaves in the frame K. All the necessary formulae are available
now. In this case $dI'(\theta', \varphi') = \mathrm{const}\cdot d\Omega'$ and

$$dI = \frac{dI'}{\Gamma^2(1 - \beta\cos\theta)} = \frac{\mathrm{const}}{\Gamma^2(1 - \beta\cos\theta)}\,d\Omega' = \frac{\mathrm{const}\cdot d\Omega}{\Gamma^4(1 - \beta\cos\theta)^3} = \mathrm{const}\cdot\frac{\left(1 - \frac{V^2}{c^2}\right)^2}{\left(1 - \frac{V}{c}\cos\theta\right)^3}\,d\Omega.$$

From the last formula the "searchlight effect" in K can be
seen. The radiation concentrates around the direction θ = 0,
since the value of the denominator is the least at cos θ ≈ 1, with
the ratio V/c fixed.
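
The two angular distributions lend themselves to a quick numerical look. The sketch below is an illustration under stated assumptions: the function names are invented for the example, the overall constant is set to 1, and the closed-form peak angle of the dipole pattern comes from our own differentiation of $\sin^2\theta/(1-\beta\cos\theta)^5$, not from the text. The dipole is taken with its axis along the direction of motion, as in Fig. 7.2.

```python
import math

def isotropic_in_K(theta, beta):
    """dI/dOmega in K for a source that radiates isotropically in K'."""
    return (1.0 - beta**2) ** 2 / (1.0 - beta * math.cos(theta)) ** 3

def dipole_in_K(theta, beta):
    """dI/dOmega in K for a dipole oscillator moving along its own axis."""
    return ((1.0 - beta**2) ** 3 * math.sin(theta) ** 2
            / (1.0 - beta * math.cos(theta)) ** 5)

def dipole_peak_angle(beta):
    """Angle of maximum dipole radiation in K.

    Setting the derivative to zero gives 3*beta*c^2 + 2*c - 5*beta = 0
    for c = cos(theta); the root with |c| <= 1 is used.
    """
    if beta == 0.0:
        return math.pi / 2
    cos_peak = (math.sqrt(1.0 + 15.0 * beta**2) - 1.0) / (3.0 * beta)
    return math.acos(cos_peak)

def total_over_sphere(pattern, beta, steps=20000):
    """Midpoint-rule integral of pattern(theta, beta) over the full solid angle."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += pattern(theta, beta) * 2.0 * math.pi * math.sin(theta) * h
    return total
```

For β = 1/2 the function `dipole_peak_angle` returns an acute angle, in agreement with the remark about Fig. 7.2, and `isotropic_in_K` decreases monotonically from θ = 0 to θ = π, which is the searchlight effect.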

§ 7.4. The pressure exerted by an electromagnetic wave (light)
on a surface. The pressure exerted on a surface of a body, i.e. the
force acting on a unit of area, is defined by the momentum flowing
across a unit of area and by the optical properties of the surface. The
momentum flow is expressed via the spatial components of the
energy-momentum-tension tensor $T_{ik}$, which for a plane wave takes
either the form of Eq. (7.17) or (7.19), depending on the propagation direction. When the wave propagates along the x' axis,
then, as Eq. (7.17) shows, the tension tensor has only one component differing from zero, that is $T'_{11} = -w'$. To find the momentum flow across the given surface element, one has to define the
direction of the normal to this surface $\mathbf{n}(n_\alpha)$. Then (see Chapter 6) the momentum flow across the element dS with the normal $\mathbf{n}$
is equal to (Fig. 7.3)

$$T'_{\alpha\beta}\,n'_\alpha n'_\beta\,dS' = T'_{11}\,n'_1 n'_1\,dS',$$

because only one term of the double summation differs from zero.
The magnitude of pressure acting on the unit of area normal to the
x' axis is equal to $p' = |T'_{11}| = |w'|$. If a light pulse propagates at the velocity c, the unit of area receives the energy per
unit of time equal to $w'c = \mathcal{E}'$.
But we have seen that $w' = p'$,
whence

$$p' = \mathcal{E}'/c. \qquad (7.32)$$


Consequently, the pressure of light is equal to the energy of
an electromagnetic wave incident on a unit of area per unit
of time and divided by c, provided that the wave is absorbed.

Fig. 7.3. The calculation of pressure exerted by an electromagnetic wave on
a surface.

Now let us determine the force exerted on a wall by a
light wave incident on this wall
at a certain angle and reflected from it. Let the incident angle be
equal to θ. We shall denote the normal to the wall by $\mathbf{n}$, and the
unit vectors directed along the propagation of the incident and
reflected waves by $\mathbf{s}$ and $\mathbf{s}'$ respectively. The momentum flow across
a unit of area will yield the pressure $\mathbf{p}$ whose components are as
follows:

$$p_\alpha = T_{\alpha\beta}\,n_\beta + T^{r}_{\alpha\beta}\,n_\beta = (T_{\alpha\beta} + T^{r}_{\alpha\beta})\,n_\beta,$$

where $T_{\alpha\beta}$ and $T^{r}_{\alpha\beta}$ are the tension tensor components of incident
and reflected waves.

The components of $T_{\alpha\beta}$ for the wave propagating at the angle θ
to the x axis are given by Eq. (7.19). The three-dimensional wave
vector of the reflected beam differs from that of the incident beam
by the substitution of −θ for θ. Let us introduce the reflection
coefficient R so that $w^{r} = Rw$. Keeping in mind that $T_{11} = T'_{11}\cos^2\theta = -w\cos^2\theta$, and $T_{12} = T'_{11}\sin\theta\cos\theta = -w\sin\theta\cos\theta$,
we obtain the following expression for the normal force (pressure
of light):

$$p_x = (w + Rw)\cos^2\theta = w(1 + R)\cos^2\theta$$


and for the tangential force:

$$p_y = w(1 - R)\sin\theta\cos\theta.$$

Let us write out the magnitudes of normal pressure $p_x$ for the
two most interesting cases. In the case of the normal incidence
(θ = 0) the pressure is equal to 2w if the wave is reflected completely and to w if it is fully absorbed. In the case of isotropic
radiation $p_x$ should be averaged over all directions, i.e. the average
value of $\cos^2\theta$ should be taken. But the average value of the
square of the direction cosine of the spatially-isotropic unit vector
is equal to 1/3. Thus in the case of a total absorption of isotropic
radiation the pressure is defined by the formula p = w/3.

Surely, all these formulae can be obtained by means of elementary reasoning. Proceeding from the magnitude of electromagnetic field momentum density $g = S/c^2$ and that of the Poynting
vector in a plane wave $S = wc$, we obtain $g = w/c$ (w is the
energy density). When a plane wave falls on the wall at the
angle θ, a unit of area takes up all the energy and momentum
per unit of time which are confined within an oblique cylinder
whose base is formed by the unit of area and whose generatrix is
numerically equal to the light propagation velocity. The volume of
such a cylinder is equal to $c\cos\theta$. Therefore 1 m² of the wall takes
up the energy $\mathcal{E} = wc\cos\theta$ during 1 s while the momentum $G_n$
transmitted in the direction perpendicular to the wall during 1 s
is equal to $G_n = g\cos\theta\cdot c\cos\theta = w\cos^2\theta$. But the momentum
transmitted to a unit of area per unit of time is just equal to the
pressure p. Introducing as before the reflection coefficient R, we
get $p = w(1 + R)\cos^2\theta$.
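
The formulae of this section are easy to tabulate. The following Python sketch is an illustration only (the function names and the midpoint-rule averaging are assumptions of the example): it evaluates the normal and tangential components $w(1+R)\cos^2\theta$ and $w(1-R)\sin\theta\cos\theta$, and averages $\cos^2\theta$ over all directions to recover the factor 1/3 for isotropic radiation.

```python
import math

def light_pressure(w, theta, R=0.0):
    """Normal and tangential force per unit area for energy density w,
    incidence angle theta (radians) and reflection coefficient R."""
    normal = w * (1.0 + R) * math.cos(theta) ** 2
    tangential = w * (1.0 - R) * math.sin(theta) * math.cos(theta)
    return normal, tangential

def mean_cos2_isotropic(steps=100000):
    """Average of cos^2(theta) over the unit sphere (weight sin(theta)/2)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.cos(theta) ** 2 * 0.5 * math.sin(theta) * h
    return total
```

At normal incidence `light_pressure(w, 0.0, 1.0)` gives 2w and `light_pressure(w, 0.0, 0.0)` gives w, the two limiting cases quoted above.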

§ 7.5. The light frequency variation on reflection from a moving
surface (mirror). Let a light beam propagate in the frame K at
the angle $\theta_0$ to the x axis in the (x, y) plane. A mirror located
parallel to the y axis moves relative to the reference frame K at
the velocity V. The light beam reaching the mirror is reflected from
it. We shall find the frequency and propagation direction of the
reflected beam in terms of the frame K.

It is convenient to introduce the reference frame K’ fixed to the 
mirror. Then the problem is solved as follows. In the frame K the 
4-vector of the light beam is specified, i.e. the frequency and prop- 
agation direction of light are known. The frequency of light and 
the beam direction in the frame K’ are easy to find using the Lo- 
rentz transformation formulae. In the frame K’ in which the mirror 
is at rest the routine law of reflection is valid: an angle of incid- 
ence is equal to an angle of reflection. This implies that the 4-vec- 
tor of the reflected beam differs from that of the incident beam 
only by the sign of the wave vector component along the x axis. 
To obtain the 4-vector of the reflected beam in the frame K, one 
has to apply the Lorentz transformation once more. 

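Now, let in the frame K the light beam of the frequency $\omega_0$
propagate at the angle $\theta_0$ to the x axis in the plane (x, y). The
components of the 4-vector $\vec{k}^0$ in the frame K will be

$$k_1^0 = k^0\cos\theta_0 = \frac{\omega_0}{c}\cos\theta_0, \qquad k_3^0 = 0,$$
$$k_2^0 = k^0\sin\theta_0 = \frac{\omega_0}{c}\sin\theta_0, \qquad k_4^0 = i\frac{\omega_0}{c} = ik^0. \qquad (7.33)$$

Let us find the 4-vector $\vec{k}'$ of the same beam in the frame K'.
According to the Lorentz transformation (4.10a) we get

$$k'_1 = \Gamma(k_1^0 + i\beta k_4^0), \quad k'_2 = k_2^0, \quad k'_3 = k_3^0, \quad k'_4 = \Gamma(k_4^0 - i\beta k_1^0). \qquad (7.34)$$

The component $k'_1$ alters its sign on reflection from the mirror
which is stationary in the frame K'. Therefore, the 4-vector $\vec{k}''$ of
the reflected beam will take the form

$$k''_1 = -\Gamma(k_1^0 + i\beta k_4^0), \quad k''_2 = k_2^0, \quad k''_3 = k_3^0, \quad k''_4 = \Gamma(k_4^0 - i\beta k_1^0). \qquad (7.35)$$

In the frame K the reflected beam will be described by the 4-vector
$\vec{k}$ which is derived from the 4-vector $\vec{k}''$ via the inverse Lorentz
transformation from the frame K' to the frame K:

$$k_1 = \Gamma(k''_1 - i\beta k''_4) = -\Gamma^2\{(1 + \beta^2)k_1^0 + 2i\beta k_4^0\} = -\Gamma^2\frac{\omega_0}{c}\{(1 + \beta^2)\cos\theta_0 - 2\beta\}, \qquad (7.36)$$

$$k_2 = k_2^0 = \frac{\omega_0}{c}\sin\theta_0, \qquad k_3 = k_3^0 = 0, \qquad (7.37)$$

$$k_4 = \Gamma(k''_4 + i\beta k''_1) = \Gamma^2\{(1 + \beta^2)k_4^0 - 2i\beta k_1^0\} = i\Gamma^2\frac{\omega_0}{c}\{(1 + \beta^2) - 2\beta\cos\theta_0\}. \qquad (7.38)$$

Since $k_3 = k_3^0 = 0$, the reflected beam keeps propagating within
the plane (x, y). Assuming that

$$k_1 = \frac{\omega}{c}\cos\theta, \quad k_2 = \frac{\omega}{c}\sin\theta, \quad k_3 = 0, \quad k_4 = i\frac{\omega}{c}, \qquad (7.39)$$

we get from Eq. (7.38)

$$\omega = \omega_0\,\frac{1 + \beta^2 - 2\beta\cos\theta_0}{1 - \beta^2}. \qquad (7.40)$$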


Consequently, the frequency ω of reflected light observed in the
frame K is not equal to the frequency $\omega_0$ of incident light.

In the frame K the tangent of the reflection angle is obtained in the
following form:

$$\tan\theta = \frac{k_2}{k_1} = -\frac{\sin\theta_0\,(1 - \beta^2)}{(1 + \beta^2)\cos\theta_0 - 2\beta}. \qquad (7.41)$$

It is seen from Eq. (7.41) that for β ≠ 0 and oblique incidence
$\tan\theta \neq -\tan\theta_0$. Therefore, the angle of
incidence and the angle of reflection prove to be different in
the frame K (Fig. 7.4a).

Fig. 7.4. Reflection of light from a moving mirror. (a) A mirror moves along
the normal to its plane. As a result of reflection, the frequency changes and
the angle of incidence is not equal to the angle of reflection. (b) A mirror
moves parallel to its plane. The frequency remains unchanged on reflection; the angle of incidence is equal to
the angle of reflection.
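
The three-step procedure described above (transform to K', reverse the normal component of the wave vector, transform back to K) can be reproduced in a few lines. The Python sketch below is an illustration with assumed names; it works with the real time component ω/c instead of the imaginary $k_4$, which is an equivalent bookkeeping choice, and uses units with c = 1 by default.

```python
import math

def boost_x(kx, k0, beta):
    """Boost (k_x, k_0 = omega/c) into a frame moving at beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (kx - beta * k0), gamma * (k0 - beta * kx)

def reflect_from_moving_mirror(omega0, theta0, beta, c=1.0):
    """Frequency and direction in K of light reflected by a mirror that
    moves along its normal (the x axis) with velocity beta*c."""
    k0 = omega0 / c
    kx = k0 * math.cos(theta0)
    ky = k0 * math.sin(theta0)
    kx_m, k0_m = boost_x(kx, k0, beta)        # into the mirror frame K'
    kx_m = -kx_m                              # law of reflection in K'
    kx_r, k0_r = boost_x(kx_m, k0_m, -beta)   # back into the frame K
    return c * k0_r, math.atan2(ky, kx_r)

def frequency_formula(omega0, theta0, beta):
    """Closed form corresponding to Eq. (7.40)."""
    return omega0 * (1.0 + beta**2 - 2.0 * beta * math.cos(theta0)) / (1.0 - beta**2)
```

The frequency returned by `reflect_from_moving_mirror` can be compared with `frequency_formula`, and the tangent of the returned angle with the right-hand side of Eq. (7.41).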

It is worthwhile to write out the formulae pertaining to the
normal incidence of light on the mirror. Let in the frame K the
angle of incidence $\theta_0 = 0$. Then we get from Eqs. (7.36), (7.37) and (7.40)

$$\omega = \omega_0\,\frac{1 - \beta}{1 + \beta}, \qquad \cos\theta = -1, \qquad \sin\theta = 0. \qquad (7.42)$$

It follows from Eq. (7.42) that reflected light also propagates
along the normal to the mirror, although in the direction opposite to the initial one. The frame K' in which the mirror
is at rest moved in the same direction in which light propagated. The frequency of light
decreased on reflection.

If the mirror moves toward the beam, the quantity β alters
its sign, and therefore we get

$$\omega = \omega_0\,\frac{1 + \beta}{1 - \beta}, \qquad \cos\theta = -1, \qquad \sin\theta = 0.$$

The frequency of light increases on reflection.


Making use of this effect, one can determine the velocity of a
moving object, e.g. an automobile. When an automobile moves
toward an observer, the frequency change on reflection is found
from Eq. (7.40) with an accuracy to within the terms of the order
of $\beta^2$ ($\Delta\omega = \omega - \omega_0$):

$$\frac{\Delta\omega}{\omega_0} = \frac{2\beta}{1 - \beta} \approx 2\beta. \qquad (7.43)$$

If, for example, the velocity of an automobile is 72 km/h = 20 m/s,
then $\beta = 20/(3\cdot 10^8) \approx 0.67\cdot 10^{-7}$. This relative change of frequency
can readily be detected by means of standard instruments.
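
As a small worked example of Eq. (7.43), the following Python sketch (function names are assumptions of the illustration) evaluates the relative frequency shift for a reflector approaching at a given speed and inverts the first-order relation to recover the speed from a measured shift.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def relative_shift(v, c=C):
    """Return (exact, first_order) values of (omega - omega_0)/omega_0
    for a reflector approaching the observer with speed v."""
    beta = v / c
    exact = 2.0 * beta / (1.0 - beta)
    return exact, 2.0 * beta

def speed_from_shift(shift, c=C):
    """Invert the first-order relation shift = 2 v / c."""
    return 0.5 * shift * c
```

For v = 20 m/s the exact and first-order values differ only in terms of the order of β², far below anything measurable in this application.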

We have considered the case when the mirror moves along its
normal. However, the mirror may move parallel to its own plane
(Fig. 7.4b). In this case we have to modify somewhat the formulae
used. Eqs. (7.33) and (7.34) remain unchanged; however, it is the sign
of $k'_2$ that is altered on reflection, and therefore,

$$k''_1 = \Gamma(k_1^0 + i\beta k_4^0), \quad k''_2 = -k_2^0, \quad k''_3 = k_3^0, \quad k''_4 = \Gamma(k_4^0 - i\beta k_1^0). \qquad (7.44)$$

Since $k_3 = k_3^0 = 0$, the reflected beam remains in the plane (x, y)
as before. Returning to the frame K, we get

$$k_1 = \Gamma(k''_1 - i\beta k''_4) = k_1^0, \qquad (7.45)$$
$$k_2 = -k_2^0, \qquad (7.46)$$
$$k_3 = 0, \qquad (7.47)$$
$$k_4 = \Gamma(k''_4 + i\beta k''_1) = k_4^0. \qquad (7.48)$$

From Eq. (7.48) we immediately obtain that $\omega = \omega_0$, and while
writing $\tan\theta = k_2/k_1$, we find that $\tan\theta = -\tan\theta_0$, i.e. $\theta = -\theta_0$.
Hence, when the mirror moves parallel to itself the frequency of
incident light is equal to that of reflected light, and the angle of
incidence is equal to the angle of reflection (in the frame K).
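
The same bookkeeping can be repeated for the mirror sliding in its own plane. In the Python sketch below (an illustration with assumed names, real time component ω/c and c = 1 by default) the boost is still along x, but the reflection reverses the y component of the wave vector.

```python
import math

def reflect_from_sliding_mirror(omega0, theta0, beta, c=1.0):
    """Frequency and direction in K of light reflected by a mirror whose
    plane contains the x axis and which moves along x with velocity beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    k0 = omega0 / c
    kx = k0 * math.cos(theta0)
    ky = k0 * math.sin(theta0)
    # into the mirror frame K'
    kx_m = gamma * (kx - beta * k0)
    k0_m = gamma * (k0 - beta * kx)
    ky_m = -ky                      # reflection reverses the normal component
    # back into the frame K
    kx_r = gamma * (kx_m + beta * k0_m)
    k0_r = gamma * (k0_m + beta * kx_m)
    return c * k0_r, math.atan2(ky_m, kx_r)
```

The two boosts cancel exactly, so the function returns the original frequency and the angle −θ₀ for any β, as Eqs. (7.45) to (7.48) state.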

In conclusion let us write out the formulae describing the reflection from the mirror moving along the normal to its plane in
the non-relativistic approximation, that is in the case when the
velocity of the mirror is small: $\beta = V/c \ll 1$. Ignoring all terms
of the order $\beta^2$, we get

$$\omega = \omega_0(1 - 2\beta\cos\theta_0),$$
$$\cos\theta = -\cos\theta_0 + 2\beta\sin^2\theta_0,$$
$$\sin\theta = \sin\theta_0 + 2\beta\sin\theta_0\cos\theta_0.$$

In the case of the normal incidence on the mirror ($\theta_0 = 0$) the
beam is reflected in the direction opposite to the initial one whereas
the frequency varies according to the following law:

$$\omega = \omega_0\left(1 - 2\frac{V}{c}\right), \qquad (7.49)$$


if the mirror moves in the same direction as the light beam. If
the mirror moves toward light, then

$$\omega = \omega_0\left(1 + 2\frac{V}{c}\right). \qquad (7.50)$$

Eqs. (7.49) and (7.50) permit of a straightforward interpretation. Reflected light can be imagined as coming from an imaginary
source positioned behind the mirror, with the velocity of this imaginary source being equal to 2V. Consequently, if the imaginary
source is replaced by a real one with the same natural frequency
$\omega_0$, the frequency change defined by Eqs. (7.49) and (7.50) will
correspond to the Doppler effect for this source.

The considered instances of light reflected from a moving mirror 
represent specific cases of the general problem involving electro- 
magnetic phenomena arising at a moving interface dividing two 
media. 

§ 7.6. Light quanta (photons) as relativistic particles. Relativistic mechanics presented in Chapter 5 dealt with particles possessing a finite (differing from zero) rest mass. This is manifested,
in particular, by a 4-momentum of a particle $\vec{P} = m\vec{V}$ being meaningful only on the condition that m ≠ 0. The particles whose rest
mass differs from zero are referred to as tardyons. No such particle can reach the velocity c through acceleration. This can be
seen from the fact that infinite energy and momentum are required for these particles to gain the ultimate velocity ($\mathcal{E} = mc^2\gamma$,
$\mathbf{p} = m\gamma\mathbf{v}$, but if $v \to c$, the factor $\gamma \to \infty$). The solutions of all concrete problems cited in Chapter 5 testify that v remains less than c
in all cases.

While investigating an electromagnetic field interacting with
microparticles, physicists inferred that in such an interaction a
microparticle, e.g. an electron, always gains a definite energy and
definite momentum from the electromagnetic field. (For the sake
of simplicity we discuss here a monochromatic radiation, i.e. a
radiation of a given frequency ω.) The assumption that an electromagnetic field imparts energy to an electron
in definite portions (quanta) was first made by A. Einstein in the
framework of the photoeffect theory (1905). In order to explain the
scattering of high-energy γ-quanta by electrons, an electromagnetic
field had to be assumed to transfer not only a definite energy to
an electron but a definite momentum as well (the Compton effect, 1923).

These properties of an electromagnetic field interacting with an
electron can be graphically described in terms of an interaction of
"particles of light", possessing a definite energy and momentum,
with an electron. Of course, it would be extremely naive to imagine
an electromagnetic field consisting of some kind of particles resembling billiard balls. Nevertheless, this "particle-of-light" concept is perfectly
suitable for describing energy and momentum exchange between
a field and microparticles. This being borne in mind, the concept of
particles of light, called light quanta, or photons, cannot lead to a
misunderstanding.

What properties should we attribute to a photon in order to treat
it as a relativistic particle? One of the photon's properties, to wit,
the relationship between its energy and momentum, can be obtained from macroscopic electrodynamics. Indeed, the pressure P
of light falling on a wall from vacuum was given by the relation
(7.32)

$$P = \mathcal{E}/c, \qquad (7.51)$$

where $\mathcal{E}$ denotes the energy taken up by a unit of area of the wall
per unit of time. Now let us imagine that photons fall on the wall.
(Below it will be seen that the most essential is the fact that the
energy and momentum are transmitted by quantized portions.) Let
us depict light as a plane wave so that all photons move in the
same direction. Suppose each photon carries the energy ε and
momentum p. If a unit of area of the wall absorbs N photons
falling on it every unit of time, the wall gains the energy Nε and
momentum Np. But the momentum gained by a unit of area of the
wall per unit of time is just the pressure of light P,* so that P = Np
and $\mathcal{E} = N\varepsilon$. That is why from Eq. (7.51) follows the relation
between the energy and momentum of a photon:

$$p = \varepsilon/c. \qquad (7.52)$$

But in the case of relativistic particles the relationship between
the energy, momentum and velocity of motion is established by the
expression $\mathbf{p} = (\varepsilon/c^2)\mathbf{v}$, whence it is clear that the relation (7.52)
is valid only if v = c. Thus, a photon can be interpreted as a
relativistic particle provided it moves at the velocity c.

Just as for any relativistic particle, the 4-vector of energy-momentum $\vec{P}(\mathbf{p}, i\varepsilon/c)$ can be constructed for a photon in vacuo.
Using the general formulae for the calculation of the square of the
4-vector and taking into account Eq. (7.52), we get $\vec{P}^2 = 0$. On
the other hand, for conventional particles $\vec{P}^2 = -m^2c^2$ (see Eq.
(5.47)). This means that the rest mass of a photon is equal to
zero. In order to permit the (imaginary) particle to reach the ultimate relativistic velocity, we had to discard the finite rest mass.

The rest mass of a photon proved to be equal to zero, and at
first sight this fact seems to be rather regrettable. We have got
used to all bodies and particles in nature possessing a rest mass.
Until quite recently mass was regarded as an indispensable attribute of matter taken for actually existing reality. Physicists were
also inclined to believe that a rest mass defines individual features
of every body or object. In classical mechanics any mass-possessing entity could be traced, at least theoretically, in the course of
time.

* According to Newton's law $\mathbf{F} = d\mathbf{p}/dt$. Dividing both sides of this equality
by the area s on which the force acts, we obtain the pressure, i.e. $P = F/s = (1/s)(dp/dt)$; it is the momentum increment transmitted to a unit of area
per unit of time that appears on the right-hand side here.

Until the beginning of this century light was thought of as a 
mysterious phenomenon; even physicists doubted its material na- 
ture. But in 1901 P.N. Lebedev discovered the pressure of light 
experimentally. The pressure is caused by the momentum flux. 
Prior to this, there were no particular doubts that light carried 
energy. But if light possesses both energy and momentum, its ma- 
terial nature cannot be questioned. Although the rest mass of an 
individual light quantum is equal to zero, there is nothing re- 
prehensible about it. In nature there are objects whose rest mass is 
finite, and also objects of zero rest mass. The latter move at the 
velocity of light and cannot be stopped; throughout all reference 
frames their velocity is the same. When brought to a standstill, 
they terminate their existence, passing into other forms of matter. 
It is the very fact that the forms of matter possessing a zero rest 
mass convert into those with a finite rest mass (and back), that 
illustrates an equivalence of these forms. 

The expressions for the photon's energy and momentum cannot
be obtained in terms of relativistic mechanics. Contemporary
physics, however, discovered that in the processes of emission and
absorption, as well as in interactions with matter, light behaves
as an assembly of quasiparticles, each of which possesses the
energy $\hbar\omega$ and momentum $\hbar\omega/c$. Here $\hbar = h/2\pi$, where $h = 6.626\cdot 10^{-34}$ J·s is Planck's constant, and ω is the circular frequency of light. Every
elementary act of interaction with matter involves one quasiparticle
of this kind, called a light quantum by Einstein in his time. The
energy and momentum conservation laws are valid in these interactions. The expression $\varepsilon = \hbar\omega$ can be taken for the light quantum energy and $\mathbf{p} = (\hbar\omega/c)\,\mathbf{s}$ for its momentum; here $\mathbf{s}$ is a unit
vector directed along the light beam. Thus, if one treats a light
quantum (photon) as a relativistic particle, its 4-vector of energy-momentum takes the form $\vec{P}(\hbar\mathbf{k}, i\hbar\omega/c)$, where $\mathbf{k} = k\mathbf{s}$, $k = 2\pi/\lambda = \omega/c$. Cancelling all components of $\vec{P}$ by the common
factor, which is the constant $\hbar$, we obtain the same 4-vector $\vec{k}$
again which earlier denoted the wave vector. This time, however,
it is defined to fit a photon: $\vec{k}(\mathbf{k}, ik)$, $k = \omega/c$, $\vec{P} = \hbar\vec{k}$. Since the
four-dimensional momentum of a photon coincides, to within the factor $\hbar$, with the four-dimensional wave vector introduced in § 7.2, all results obtained for a wave are fully
applicable to a photon. We mean here the formulae describing the
Doppler effect, aberration of light, change of light frequency on
reflection from a moving mirror. In terms of the photon theory of
light one can easily derive the formula describing the pressure of
light. Indeed, let a photon fall on a surface of a body at an angle θ.
The normal component of its momentum is equal to $\hbar\omega\cos\theta/c$
(Fig. 7.5). On absorption of the photon the wall acquires just this
momentum along the normal's direction. If the photon is reflected,
the magnitude of the transmitted momentum depends on the reflection coefficient, that is the photon reflection probability; let us denote it by R. Then the momentum component transmitted normally
to the wall on reflection of the photon is equal to $(1 + R)(\hbar\omega/c)\cos\theta$
where R ≤ 1. If n denotes the number of photons in 1 m³, all the
photons confined within an oblique cylinder whose generatrix is
equal to c will fall on 1 m² area
of the wall per 1 s. The base of
this cylinder is equal to 1 m² and
its volume to $c\cos\theta$. Consequently, $nc\cos\theta$ photons will fall on
1 m² area of the wall per 1 s.
Provided all these photons get absorbed, the wall acquires the energy $\hbar\omega nc\cos\theta$ and the normal momentum component

$$nc\cos\theta\,(1 + R)\frac{\hbar\omega}{c}\cos\theta = \hbar\omega n(1 + R)\cos^2\theta.$$

Fig. 7.5. Calculation of pressure produced by light. The area of the
oblique cylinder base is equal to S = 1.

The momentum transmitted to a unit of area of the wall per unit
of time is just the pressure on the wall; therefore, $p = w(1 + R)\cos^2\theta$ where $w = n\hbar\omega$ is the energy density in the
beam. This result coincides with the one derived in § 7.4.
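
The photon-counting argument can be written out as a short calculation. The Python sketch below is an illustration only: the function names are assumptions, the constants are the SI defining values of ħ and c, and the time component of the 4-momentum is stored as the real number ε/c, so that the square is taken with a minus sign for that component.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299_792_458.0        # speed of light in vacuum, m/s

def photon_four_momentum(omega, s):
    """(p_x, p_y, p_z, eps/c) for a photon of circular frequency omega
    moving along the unit vector s = (s_x, s_y, s_z)."""
    p = HBAR * omega / C
    return (p * s[0], p * s[1], p * s[2], p)

def four_square(P):
    """p^2 - (eps/c)^2, which vanishes for a photon in vacuo."""
    return P[0] ** 2 + P[1] ** 2 + P[2] ** 2 - P[3] ** 2

def pressure_from_photons(n, omega, theta, R=0.0):
    """Pressure on a wall from n photons per m^3 falling at the angle theta,
    each reflected with probability R."""
    flux = n * C * math.cos(theta)                             # photons per m^2 per s
    dp = (1.0 + R) * HBAR * omega / C * math.cos(theta)        # normal momentum per photon
    return flux * dp
```

The value returned by `pressure_from_photons` coincides with $w(1 + R)\cos^2\theta$ for $w = n\hbar\omega$, which is the statement of the last paragraph.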

Unfortunately, today a "photon mass" is still defined by the formula taken from relativistic mechanics of particles with a finite
rest mass, namely, $m_{ph} = \varepsilon/c^2$ ($\varepsilon = \hbar\omega$). First of all, the formula
$\varepsilon = mc^2$ holds explicitly for the case m ≠ 0 (see Chapter 5) and
is quite irrelevant to the case m = 0. Besides, the mass $m_{ph}$ has
no physical sense at all. There is also no sense in speaking of the
inertial properties of a photon: in all reference frames it moves at
the velocity c, i.e. a photon in vacuo cannot be either accelerated
or decelerated (it can only be destroyed). In quantum statistics
photons are treated as identical particles, this assumption giving
correct results. However, if a "photon mass" had any meaning, the
"blue" photon would have been "heavier" than the "red" one, thus
violating the identity of particles. On the contrary, the only common property of photons is their zero rest mass. Sometimes the
photon mass $m_{ph}$ is used to explain the deviation of a light beam
in a gravitational field. It is inconsistent, however, to treat such
a relativistic object as light in terms of classical mechanics. Needless to say, the relativistic theory predicts the deviation of a light
beam in a gravitational field without resorting to any photon mass.
Finally, there are some who wish to have the "mass conservation
law" in relativistic physics. Since a rest mass is not additive
(§ 5.6), the new "masses" are introduced on the basis of the relation $m = \mathcal{E}/c^2$. This is, however, quite meaningless to undertake
since the conservation of such a "mass" is just a consequence of
the energy conservation law which is always valid. To summarize,
we can state that the introduction of a photon mass does not bring
any advantages, complicating needlessly the subtle concept of mass
(see §§ 5.6 and 5.7).

Coming back to the zero rest mass of a photon, let us make some 
more remarks. There is no real IFR in which a photon would be 
at rest, so that the photon’s rest mass is an unobserved quantity. 
It is just as meaningless to speak of time flowing in the reference
frame fixed to a photon. The zero mass of a photon does not at all
signify the absence of mass. For example, the temperature 0°C
does not mean that the body lacks internal energy. It should be 
recalled here that in the STR there are the world lines of the zero 
length which are not less meaningful than all other lines. Surely, 
this is all due to the velocity of light being distinguished among 
other velocities. Apart from photons in vacuo there are also the 
“real” particles, neutrinos, moving at the velocity c as well. Their 
rest mass is also equal to zero and cannot be observed in an ex- 
periment. After all, the question whether the photon’s mass is equal 
to zero or not can be solved experimentally. There are methods 
capable of detecting the photon’s rest mass if it differs from zero. 
As more and more experiments of this kind are conducted, the
upper limit on the photon's "rest mass" slides gradually lower and
lower. By the end of 1975 this limit reached the value m< 10-® kg. 

§ 7.7. Light quanta in a medium. The Vavilov-Cherenkov effect. 
The anomalous Doppler effect. It is seen from the previous section 
and § 6.12 that the photon’s momentum in a medium is determined 
according to Eq. (6.183) when proceeding from the Minkowski 
tensor and according to Eq. (6.184) when proceeding from the 
Abraham tensor. The photon’s energy remains constant on transi- 
tion from one medium to another, provided the oscillation fre- 
quency does not vary. Then what expression for the momentum 
should be used when the momentum conservation law is applied 
to "light quanta in a medium"? There is no direct answer to this
question, and some considerations in this respect will be given at
the end of this section. And now we shall show that if light quanta
(photons) in a medium are utilized in the form given by Eq.
(6.183), we can obtain useful results concerning radiation kinematics, i.e. conditions imposed on a frequency and direction of radiation. These conditions are defined by the energy and momentum
conservation laws. We shall begin with the elementary derivation
of the conditions for the Vavilov-Cherenkov radiation.

In this case radiation is emitted by a particle which has no in- 
ternal degrees of freedom. We shall write down the conservation 
laws for an electron-radiation system. Of course, the conservation 
laws per se do not answer the question as to whether radiation 
will occur. This question can be solved by calculation on the basis 
of equations of electrodynamics. However, if the conservation laws 
do not hold, the radiation is absent a fortiori. 

Suppose a light quantum has been radiated. If the energy and
momentum of an electron before the radiation were $\mathcal{E}_0$, $\mathbf{p}_0$ and became $\mathcal{E}_1$, $\mathbf{p}_1$ after it, the energy and momentum conservation laws
take the form

$$\Delta\mathcal{E} = \mathcal{E}_0 - \mathcal{E}_1 = \hbar\omega, \qquad (7.53)$$
$$\Delta\mathbf{p} = \mathbf{p}_0 - \mathbf{p}_1 = \frac{\hbar\omega}{c}\,n\mathbf{s}. \qquad (7.54)$$

Written in such a form, the conservation laws presuppose that the
change in the energy and momentum of an electron is connected
only with radiation. We can easily find the required consequence of
Eqs. (7.53) and (7.54) recalling that according to Newton's law
$\Delta\mathbf{p} = \mathbf{F}\,\Delta t$; multiplying both sides of this relation by $\mathbf{v}$ and recalling that $\mathbf{F}\mathbf{v}\,\Delta t = \Delta\mathcal{E}$, we get

$$\Delta\mathcal{E} = \mathbf{v}\,\Delta\mathbf{p}. \qquad (7.55)$$

Naturally, this relation is valid only for small momentum changes.

Substituting Eqs. (7.53) and (7.54) into Eq. (7.55), we can
cancel out $\hbar\omega$, then note that $\mathbf{v}\mathbf{s} = v\cos\theta$ where θ is the angle
between the directions of the electron motion and the radiation
propagation. Then the final kinematic condition for the radiation
angle θ will be written as follows:

$$\cos\theta = \frac{c}{nv}. \qquad (7.56)$$

This is the condition for the Vavilov-Cherenkov radiation. It is not
satisfied if an electron moves uniformly in vacuo (n = 1), since
|cos θ| ≤ 1, and the electron's velocity v is always less than c.
Consequently, an electron moving uniformly in vacuo does not
radiate.
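
Condition (7.56) is simple enough to evaluate directly. The following Python sketch is an illustration with assumed function names; the refraction index 1.33 used in the usage note is only a sample value of the order of that of water.

```python
import math

def cherenkov_angle(beta, n):
    """Radiation angle from cos(theta) = c/(n v) = 1/(n beta).

    Returns None when the condition cannot be met (|cos(theta)| would
    exceed 1), i.e. when the particle is slower than light in the medium.
    """
    if beta <= 0.0:
        return None
    cos_theta = 1.0 / (n * beta)
    if cos_theta > 1.0:
        return None
    return math.acos(cos_theta)

def threshold_beta(n):
    """Smallest v/c at which the condition can be satisfied."""
    return 1.0 / n
```

For instance, `cherenkov_angle(0.99, 1.0)` returns None, which is the vacuum case discussed above, whereas `cherenkov_angle(0.99, 1.33)` returns a real angle.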

This result also follows directly from the principle of relativity:
a charge resting in a certain IFR does not radiate. This charge
moves uniformly and rectilinearly relative to any other IFR. Radiation, however, either occurs in all IFRs or does not occur in
any of them. Consequently, a uniformly moving electron does not
radiate. This reasoning is not correct, however, for an electron
moving in a medium since a new characteristic velocity, that is
the velocity of an electron relative to a medium, appears here,
defining at the same time the "privileged" reference frame fixed
to that medium.

Note that in our approximation the final results do not contain
the quantity $\hbar$ in spite of the quantum mechanical ideas used. The
result obtained is a classical one. The quantum mechanical ideas
were employed for purely methodical reasons. We make use of the
two conservation laws which do not require an obligatory utilization of quantum mechanical concepts.

It is easy to obtain the radiation condition with the recoil mo- 
mentum taken into account. Let us introduce the 4-vector of 
energy-momentum (or briefly, 4-momentum) of a light quantum in 
a medium 


tons: Gee). (7.57) 
Cc Cc 


The photon radiation by an electron must obey the conservation 
law for the 4-vector of energy-momentum, or, in other words, the 
energy and momentum conservation laws. Let the 4-momentum of 


an electron before the radiation be equal to Do. after the radiation 
Pp, while the 4-momentum of the light quantum to a, that is 


> . > 5 >A fA 
Polmyor, imyyc), p(myv, imyc), x(“"ns, i). 


The conservation law for the 4-momentum has the form

$$\vec{p}_0 = \vec{p} + \vec{\pi},$$

or, in components,

$$p_{0i} - \pi_i = p_i.$$

Squaring the latter relation, we get

$$p_{0i}^2 - 2p_{0i}\pi_i + \pi_i^2 = p_i^2,$$

where each term involves the summation over the index i. However,
due to the invariance of the square of the particle momentum,
$p_{0i}^2 = p_i^2$, and we get

$$\pi_i^2 = 2p_{0i}\pi_i. \tag{7.58}$$
Cancelling by $\pi_i$ is prohibited here, of course: the left-hand and
right-hand sides involve independent summation. We shall calculate
the left-hand and right-hand sides of Eq. (7.58) separately:

$$p_{0i}\pi_i = \frac{\hbar\omega}{c}\,n m\gamma_0\,(\mathbf{v}_0\mathbf{s}) - m\gamma_0\hbar\omega, \qquad \pi_i^2 = \left(\frac{\hbar\omega}{c}\right)^2 (n^2 - 1).$$


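Equating these expressions and taking into account that
v₀s = v₀ cos θ, where θ is the angle between the emitted light and the
electron propagation direction, we obtain

$$\frac{1}{2}\left(\frac{\hbar\omega}{c}\right)^2 (n^2 - 1) = \frac{\hbar\omega}{c}\,n m\gamma_0 v_0\cos\theta - m\gamma_0\hbar\omega,$$

whence

$$\cos\theta = \frac{1}{n m\gamma_0 v_0}\left\{\frac{\hbar\omega}{2c}(n^2 - 1) + m\gamma_0 c\right\} = \frac{c}{n v_0}\left\{1 + \frac{\hbar\omega\,(n^2 - 1)}{2mc^2}\sqrt{1 - \frac{v_0^2}{c^2}}\right\}.$$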


When considering absorption of a quantum instead of its radiation,
we should alter the sign in front of ħω in the latter formula.
When ħω/mc² ≪ 1, which is true for visible light and an electron,
we get back to the classical condition for radiation (Eq. (7.56)):

$$\cos\theta = \frac{c}{n v_0} = \frac{v_{\mathrm{ph}}}{v_0},$$

where $v_{\mathrm{ph}} = c/n$ is the phase velocity of light in the medium.
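To get a feeling for how small the recoil term is, the radiation condition can be evaluated numerically. The following is only a sketch: the refraction index of water (1.33), the electron speed (0.9c) and the wavelength (500 nm) are assumed illustrative values, and the function names are hypothetical.

```python
import math

C = 299_792_458.0          # speed of light in vacuo, m/s
HBAR = 1.054571817e-34     # reduced Planck constant, J*s
M_E = 9.1093837015e-31     # electron rest mass, kg


def cos_theta_classical(n, beta0):
    """Classical radiation condition cos(theta) = c / (n * v0), with beta0 = v0 / c."""
    return 1.0 / (n * beta0)


def recoil_correction(n, beta0, omega, mass=M_E):
    """Dimensionless recoil term hbar*omega*(n^2 - 1)*sqrt(1 - beta0^2) / (2*m*c^2)."""
    return HBAR * omega * (n**2 - 1.0) * math.sqrt(1.0 - beta0**2) / (2.0 * mass * C**2)


def cos_theta_with_recoil(n, beta0, omega, mass=M_E):
    """Radiation condition with the recoil momentum taken into account."""
    return cos_theta_classical(n, beta0) * (1.0 + recoil_correction(n, beta0, omega, mass))


# Assumed illustrative values (not taken from the text).
n_water = 1.33
beta0 = 0.9
omega_visible = 2.0 * math.pi * C / 500e-9   # angular frequency of 500 nm light

print("classical cos(theta):  ", cos_theta_classical(n_water, beta0))
print("with recoil cos(theta):", cos_theta_with_recoil(n_water, beta0, omega_visible))
print("relative correction:   ", recoil_correction(n_water, beta0, omega_visible))
```

For these values the correction is a tiny fraction of unity, which is why the classical condition is adequate for visible light. In vacuo (n = 1) the classical expression exceeds unity for any β₀ < 1, so no real angle exists.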








Radiation kinematics deals also with the problem of light chang- 
ing its frequency and propagation direction on transition from one 
IFR to another. We mean the Doppler effect and aberration here. 
Surely, these problems are solved more easily in terms of the STR. In
§ 7.2 we considered the propagation of light in vacuo. Here we 
shall obtain the requisite formulae for a uniform and isotropic 
medium whose refraction index is equal to n; the formulae to be 
obtained will prove to be quite different from the case of vacuum. 

In fact, all calculations differ only slightly from those performed 
in § 7.2 and therefore we shall present them only in a brief form; 
in return, we shall discuss the obtained results in detail. From the
4-vector π we obtain the 4-vector of a photon in a medium, proportional
to the former vector:

$$\vec{k}\left(\frac{\omega}{c}\,n\mathbf{s},\ i\,\frac{\omega}{c}\right), \qquad \vec{\pi} = \hbar\vec{k}. \tag{7.59}$$


Assume that in the reference frame K′ light propagates in the
plane (x′, y′) at the angle θ′ to the x′ axis; a medium is at rest
in the frame K′. Then

$$\vec{k}'\left(\frac{\omega'}{c}\,n\cos\theta',\ \frac{\omega'}{c}\,n\sin\theta',\ 0,\ i\,\frac{\omega'}{c}\right). \tag{7.60}$$

The components of the vector k in the frame K are to be found
from the same Eqs. (7.9) from which it is seen that the beam
remains, as before, in the plane (x, y) of the frame K. Instead of
Eq. (7.10) we shall get

$$\omega = \omega'\,\Gamma\,(1 + \beta n\cos\theta'), \tag{7.61}$$

and instead of Eqs. (7.11) and (7.12)

$$n\cos\theta = \frac{n\cos\theta' + \beta}{1 + \beta n\cos\theta'}, \tag{7.62}$$

$$n\sin\theta = \frac{n\sin\theta'}{\Gamma\,(1 + \beta n\cos\theta')}. \tag{7.63}$$


Hence, the final formula for the Doppler effect takes the form (cf.
Eq. (7.13))

$$\omega = \frac{\omega'\sqrt{1 - \beta^2}}{1 - \beta n\cos\theta}; \tag{7.64}$$

as before, ω′ = Γ(ω − kV) (cf. Eq. (7.14)), and for the aberration
angle (cf. Eq. (7.15))

$$\tan\theta = \frac{\sqrt{1 - \beta^2}}{\beta + n\cos\theta'}\,n\sin\theta'. \tag{7.65}$$
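The mutual consistency of Eqs. (7.61)-(7.65) can be checked numerically. The sketch below is an illustration under stated assumptions: it treats the left-hand sides of Eqs. (7.62) and (7.63) simply as the two quantities defined by those equations, the function names are hypothetical, and the sample values of β, n and θ′ are arbitrary.

```python
import math


def transform_to_K(omega_prime, theta_prime, beta, n):
    """Apply Eqs. (7.61)-(7.63) to the frequency and direction given in K'."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 + beta * n * math.cos(theta_prime)
    omega = omega_prime * gamma * denom                          # Eq. (7.61)
    n_cos_theta = (n * math.cos(theta_prime) + beta) / denom     # Eq. (7.62)
    n_sin_theta = n * math.sin(theta_prime) / (gamma * denom)    # Eq. (7.63)
    return omega, n_cos_theta, n_sin_theta


def doppler_7_64(omega_prime, beta, n_cos_theta):
    """Eq. (7.64), written in terms of the quantity n*cos(theta)."""
    return omega_prime * math.sqrt(1.0 - beta**2) / (1.0 - beta * n_cos_theta)


def tan_theta_7_65(theta_prime, beta, n):
    """Eq. (7.65) for the aberration angle."""
    return math.sqrt(1.0 - beta**2) * n * math.sin(theta_prime) / (beta + n * math.cos(theta_prime))


# Arbitrary sample values chosen for the check.
beta, n, theta_prime, omega_prime = 0.3, 1.33, 0.7, 1.0
omega, n_cos, n_sin = transform_to_K(omega_prime, theta_prime, beta, n)

print("omega from (7.61):", omega)
print("omega from (7.64):", doppler_7_64(omega_prime, beta, n_cos))
print("tan(theta) from (7.62)/(7.63):", n_sin / n_cos)
print("tan(theta) from (7.65):       ", tan_theta_7_65(theta_prime, beta, n))
```

Both pairs of numbers agree, which shows that Eq. (7.64) is Eq. (7.61) with θ′ eliminated by means of Eq. (7.62), and that Eq. (7.65) is the ratio of Eqs. (7.63) and (7.62).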


Fig. 7.6. (a) The Vavilov-Cherenkov radiation and the kinematic explanation of
its generation. The positions of a uniformly moving particle are shown for the
moments of time t₁ and t₂. In the time interval t₂ − t₁ the wave front will take
the position shown by a dotted circumference line. (b) The Cherenkov cone divides
the space around the source into the regions of the anomalous and normal
Doppler effect.

Let a monochromatic source possessing the natural frequency ω₀
be at rest in the frame K′, i.e. move uniformly at the velocity V
relative to K. Then ω′ = ω₀. It is seen from Eq. (7.64) that for
n > 1, i.e. for such common media as, for example, water and
glass, βn cos θ may exceed unity even at V < c, and therefore the
denominator may turn into zero or even become a negative quantity.
Since an altered frequency sign implies, at the most, only an
alteration of an oscillation phase:

$$\cos(-\omega t) = \cos\omega t, \qquad \sin(-\omega t) = -\sin\omega t = \sin(\omega t + \pi),$$

a frequency may be always regarded as a positive quantity. Consequently,
the formula for the Doppler effect in a medium can be
finally written down as follows:


$$\omega = \frac{\omega_0\sqrt{1 - \beta^2}}{|1 - \beta n\cos\theta|}. \tag{7.66}$$


First of all, note that from Eq. (7.66) it is seen that a medium
does not affect the transverse Doppler effect: in the case of θ = π/2
we get exactly the same Eq. (7.14) as for vacuum. Thus, we get
one more piece of evidence that the transverse Doppler effect arises only
due to the relativity of time intervals between events.

When 1 − βn cos θ = 0, the denominator of Eq. (7.66) turns
into zero, this being the condition for the Cherenkov radiation.
A moving charged particle having no internal degrees of freedom
radiates within a cone around its propagation direction. In the
case of a neutral source the Cherenkov cone divides space into
two parts with respect to the observed Doppler effect. The condition
1 − βn cos θ > 0 is valid outside the Cherenkov cone
(Fig. 7.6) where we observe the normal Doppler effect for which
(dω/dθ) > 0, as it is always the case in vacuo. “Within” the Cherenkov
cone 1 − βn cos θ < 0 and (dω/dθ) < 0; this corresponds
to the anomalous Doppler effect.
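Equation (7.66) and the division of space by the Cherenkov cone lend themselves to a short numerical illustration. This is a minimal sketch with hypothetical function names; the values β = 0.8 and n = 1.5 are assumed for the example only, and the region is classified purely by the sign of 1 − βn cos θ.

```python
import math


def doppler_denominator(beta, n, theta):
    """The quantity 1 - beta*n*cos(theta) whose sign separates the two regions."""
    return 1.0 - beta * n * math.cos(theta)


def doppler_in_medium(omega0, beta, n, theta):
    """Eq. (7.66): frequency observed at the angle theta in a medium of index n."""
    denom = doppler_denominator(beta, n, theta)
    if denom == 0.0:
        raise ValueError("theta lies exactly on the Cherenkov cone")
    return omega0 * math.sqrt(1.0 - beta**2) / abs(denom)


def cherenkov_angle(beta, n):
    """Angle at which 1 - beta*n*cos(theta) = 0, or None when beta*n < 1."""
    if beta * n < 1.0:
        return None
    return math.acos(1.0 / (beta * n))


def doppler_region(beta, n, theta):
    """Classify a direction by the sign of 1 - beta*n*cos(theta)."""
    denom = doppler_denominator(beta, n, theta)
    if denom > 0.0:
        return "normal"
    if denom < 0.0:
        return "anomalous"
    return "cherenkov cone"


beta, n = 0.8, 1.5   # assumed example values, beta*n = 1.2 > 1
print("Cherenkov angle, rad:", cherenkov_angle(beta, n))
for theta in (0.1, 1.0, math.pi / 2):
    print(theta, doppler_region(beta, n, theta), doppler_in_medium(1.0, beta, n, theta))
```

At θ = π/2 the result does not depend on n, in line with the remark that a medium does not affect the transverse Doppler effect.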

It is interesting to compare the formulae for aberration of light 
in vacuo and in a medium. In the case of a normal incidence in 
the frame K′ the aberration angle α in the frame K is determined
from the following relation:

$$\tan\alpha = \frac{1}{n}\,\frac{\beta}{\sqrt{1 - \beta^2}}. \tag{7.67}$$



The only difference from the case of vacuum (cf. Eq. (7.15)) con- 
sists in the refraction index n appearing in the denominator. There 
is no singularity at V = c/n. 

And in conclusion we shall decide which expression for the
photon’s momentum should be regarded as “correct”. As it was
pointed out in § 6.12, two different expressions for the photon’s
momentum in a medium, $p^{M} = n\hbar\omega/c$ and $p^{A} = \hbar\omega/(nc)$, correspond to
different subdivisions of the momentum density of an electromagnetic
field in a medium into “the momentum density of a field”
and “the momentum density of a medium itself”. Since, in considering
the Cherenkov effect, we are interested in a complete momentum
being lost by an electron, and such a momentum is defined by the
expression $g^{M}$, the employment of $g^{M}$ leads to the correct result *.


* For more details see V. L. Ginzburg, UFN 110, 309 (1973); V. L. Ginz- 
burg, V. A. Ugarov, UFN 118, 175 (1976). 


CHAPTER 8 


ON CERTAIN PARADOXES 
OF THE SPECIAL THEORY 
OF RELATIVITY 


If one consults a dictionary, he learns that the word “paradox” 
has at least three meanings: an unusual judgement differing from 
a generally accepted one by its originality; a surprising deduction 
from certain assumptions; and a result which seems incredible at 
first glance but proves to be correct on more careful consideration. 
Studying the STR, we can easily find examples illustrating all 
meanings of the word “paradox”. 

As to the “generally accepted opinion”, it is the ideas of classical 
physics as a whole and the classical concepts of properties of space 
and time specifically, that are applied to the STR. The classical 
ideas of space and time coincide to a great extent with those con- 
cepts that we acquire in our schoolyears and in our everyday prac- 
tical life. These customary ideas have long become generally ac- 
cepted, and their employment rests, we believe, on the “common 
sense”. But in the final analysis, common sense is an accumulation of
our settled convictions among which there may also be false
ones. These false convictions are being exposed as science devel- 
ops. Very often it turns out that certain ideas are true only 
approximately, and the field of their application proves to be 
limited. That is exactly what happened to some concepts when 
the STR appeared. 

In everyday life and in classical mechanics, for example, we got 
accustomed to time having absolute meaning. Of course, this is 
substantiated fairly well. The theory of relativity showed that the 
time moments of an event measured in different IFRs are different, 
i.e. relative. However, it is psychologically difficult to readjust 
oneself to relativity of time, especially as this relativity manifests 
itself only at relativistic velocities (which macroscopic bodies never 
attain) and never in everyday life. Relativity of time, relativity of 
simultaneity and time intervals between events correlate with relativity
of lengths of scales moving with respect to one another.
From the viewpoint of the common sense which makes us believe 
in absolute time, these conclusions are paradoxical. In terms of 
contemporary physics they are not paradoxes at all. Relativity of
time is just a modern interpretation of measurement results obtained
for the time of an event. By the way, this representation
conforms very well to the treatment of time in terms of dialectical 
materialism according to which time, as a form of existence of 
eternally moving matter, may depend on the motion of matter. 

The “paradoxical” results of kinematics in the STR are very 
well known and quoted in popular books. Here we mean relativity 
of scale lengths (“contraction” of lengths), relativity of simulta- 
neity and distances between events, and relativity of time intervals 
between events. All these conclusions deviating from the “gene- 
rally accepted” classical results were discussed in detail in 
§§ 3.1-3.3. In the next sections we shall consider the paradoxes 
which are further removed from the STR fundamentals. 

§ 8.1. Faster-than-light velocities. As we saw in § 3.4 the principle
of causality requires the signal transmission velocity, that
is the velocity of transmitting an energy and a momentum, to be
finite. The motion of any particle whose rest mass differs from
zero is, in fact, a signal since such a particle (when moving)
carries along energy and momentum. Hence, it is clear that the
motion velocity of such particles cannot exceed the velocity of light
in vacuo. Kinematics of the theory of relativity shows that if in a
given IFR the velocity of a particle v < c, then in any other IFR
K′ its velocity v′ < c (see § 3.5). Let us consider another useful
example in this connection.

Fig. 8.1. Two particles moving toward each other can “approach” at a velocity
exceeding c.

Let two particles move toward each other in the frame K at
equal velocities (Fig. 8.1). The only condition that the STR imposes
on the velocities of these particles is that they should be less
than c. What is the relative velocity of these particles in K? Let the
velocity of particle 1 (denoted by v₁) be equal to v; then the velocity
v₂ of particle 2 is equal to −v. The relative velocity of the
particles v_rel = v₁ − v₂ = v − (−v) = 2v. Whence it follows that
if v > c/2, then v_rel > c. Can this velocity mean that a signal is
travelling faster than light?

In the considered case v_rel is the velocity at which the distance
between the particles decreases. This distance actually decreases
at a velocity exceeding that of light. No “signal”, however, can
be transmitted at such a velocity.

To determine the possible velocity of a signal transmission, we
shall do as follows. An observer located at particle 1 wants to
transmit a signal (information) by means of particle 2. At the
moment when the particles come alongside each other he delivers
an “information package” to particle 2. But the velocity at which
the information leaves particle 1 is not at all the velocity at which
the distance between the particles varies in K, but it is the velocity
of particle 2 relative to particle 1. In other words, the velocity
at which information (a signal) is transmitted is the path covered
by an information carrier per unit of time. The distance per se
cannot be an information carrier.

Therefore, in order to calculate the velocity of a signal transmission,
one has to calculate the velocity of particle 1 relative to
particle 2 (or vice versa). For this purpose, let us fix a frame K′
to particle 1. Assuming V = v, we obtain v′₁ = 0 from the general
formula

$$v' = \frac{v - V}{1 - vV/c^2}.$$

This is an obvious result, of course: K′ coincides with particle 1.
As to v′₂, it is calculated as follows:

$$v_2' = \frac{-v - v}{1 + v^2/c^2} = -\frac{2v}{1 + v^2/c^2} = -c\,\frac{2v/c}{1 + v^2/c^2}.$$

This is the relative velocity of the considered particles. The last
link of the equality is written down to demonstrate that |v′₂| < c.
The demonstration is presented below for the general case.
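The difference between the rate at which the gap closes in K and the velocity of one particle relative to the other is easy to tabulate. The sketch below uses the composition formula quoted above with V equal to the velocity of particle 1; speeds are expressed in units of c, and the function names and sample values are chosen here for illustration.

```python
def closing_speed(alpha1, alpha2):
    """Rate at which the distance between the particles shrinks in K, in units of c."""
    return alpha1 + alpha2


def relative_speed(alpha1, alpha2):
    """Speed of particle 2 in the rest frame of particle 1 (head-on approach), in units of c."""
    return (alpha1 + alpha2) / (1.0 + alpha1 * alpha2)


# Sample speeds (fractions of c) chosen for illustration.
for a1, a2 in [(0.6, 0.6), (0.9, 0.9), (0.99, 0.5)]:
    print(f"alpha1={a1}, alpha2={a2}: "
          f"closing={closing_speed(a1, a2):.4f} c, "
          f"relative={relative_speed(a1, a2):.6f} c")
```

The closing speed exceeds c as soon as the two fractions add up to more than unity, while the relative speed stays below c for every pair of subluminal speeds.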

Let in the frame K the velocities of particles flying toward each
other be equal in magnitude to v₁ = α₁c, v₂ = α₂c; the theory of relativity
requires only that the conditions α₁ < 1 and α₂ < 1 be satisfied. The
case when α₁ < 0.5 and α₂ < 0.5 is not of much interest since
even in K there is no velocity exceeding c. Suppose α₁ + α₂ > 1.

Let us introduce the frame K′ where v′₁ = 0; we have already
seen that in this case V = v₁. Then

$$v_2' = \frac{(-v_2) - V}{1 - (-v_2)V/c^2} = \frac{-\alpha_2 c - \alpha_1 c}{1 + \alpha_1\alpha_2} = -c\,\frac{\alpha_1 + \alpha_2}{1 + \alpha_1\alpha_2}.$$

Let us prove that α₁ + α₂ < 1 + α₁α₂. We shall rearrange this
inequality by transferring all of its terms to the right-hand side:
α₁α₂ − α₁ − α₂ + 1 > 0. Grouping the terms, we get

$$(\alpha_1 - 1)(\alpha_2 - 1) > 0,$$

which is correct since the condition imposed on α₁ and α₂ is satisfied.
The case considered earlier corresponds to α₁ = α₂.

Thus, in a given IFR a conventional particle (m ≠ 0) cannot
be accelerated to the velocity c. Nor can this velocity be reached
as a result of a transition from one IFR to another. But is it still
possible to find in nature velocities exceeding the velocity of light?

The first example suggests itself immediately. Let us take a 
solid, absolutely rigid rod (body) and push it. Both its ends will
simultaneously start moving which means that a signal will be
transmitted instantly. However, here the initial assumption is
erroneous. There are no absolutely rigid bodies in nature. All bodies
are similar to coil springs of different rigidity. The transmission
of a momentum (impact or push) from one end of a body to
the other is accomplished in the form of motion of an elastic wave. 
And the velocity of elastic waves in solids is far less than the 
velocity of light. Thus, the STR stresses once more that absolutely 
rigid bodies do not exist in nature. By the way, an instantaneous 
change of a momentum requires an infinite force, even in the 
framework of Newton’s mechanics. 

Whereas the STR explicitly limits the velocity of signal trans- 
mission, no restrictions are imposed on the velocities which are 
not associated with the signal transmission and therefore they can 
exceed c. Usually a paradox crops up when a certain velocity exceeding
c is found and claimed to be that of signal transmission.
In the final analysis, it can always be demonstrated that the con- 
sidered velocity has no relation to the signal transmission. We 
shall discuss a few examples now. 

The straight line AB moves parallel to itself at the velocity V₁
directed normally to AB, and the straight line CD also moves
parallel to itself at the velocity V₂ directed normally to CD. The
angle between the straight lines is equal to θ. What is the displacement
velocity of the intersection point M of these straight
lines (Fig. 8.2)?





Fig. 8.2. The intersection point of two moving straight lines can move faster 
than light. 


The relative velocity of the point M along the straight line AB
due to the motion of the straight line CD is equal to u₂ = V₂/sin θ.
The velocity of motion of the point M along the straight line CD
due to the motion of the straight line AB is equal to u₁ = V₁/sin θ.
The geometrical summation of the velocities u₁ and u₂
yields (Fig. 8.2)

$$V_M = \frac{1}{\sin\theta}\sqrt{V_1^2 + V_2^2 - 2V_1V_2\cos\theta}.$$


This formula shows that if θ → 0, the velocity V_M → ∞, i.e. it
can exceed c. This fact, however, by no means contradicts the 
theory of relativity. In the first place, the intersection point of the 
lines is not a material body. Secondly, this point cannot be utilized 
to transmit a signal (information) because at any given moment 
it is being formed by the new points of the two lines, i.e. the in- 
tersection point cannot be “marked”. 
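The formula for the velocity of the intersection point can be compared with a direct geometrical construction. In the sketch below the orientation of the two normals is an assumption chosen so that the cosine term enters with the minus sign, as in the formula above; the function names and the sample numbers (speeds in units of c) are illustrative only.

```python
import math


def intersection_speed(v1, v2, theta):
    """V_M = sqrt(V1^2 + V2^2 - 2*V1*V2*cos(theta)) / sin(theta)."""
    return math.sqrt(v1**2 + v2**2 - 2.0 * v1 * v2 * math.cos(theta)) / math.sin(theta)


def intersection_speed_from_geometry(v1, v2, theta):
    """Displacement of the intersection point per unit time, found from the line equations.

    Line AB lies along the x axis and moves along +y at speed v1, so y = v1 * t.
    Line CD makes the angle theta with AB and moves along its unit normal
    (-sin(theta), cos(theta)) at speed v2, so -x*sin(theta) + y*cos(theta) = v2 * t.
    """
    y = v1
    x = (v1 * math.cos(theta) - v2) / math.sin(theta)
    return math.hypot(x, y)


v1, v2, theta = 0.1, 0.2, 0.05   # two slow lines meeting at a small angle
print("formula: ", intersection_speed(v1, v2, theta))
print("geometry:", intersection_speed_from_geometry(v1, v2, theta))
```

With these numbers both expressions give about twice the velocity of light although each line moves at no more than 0.2c.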

The case of an oblique incidence of a plane light wave on a
plane surface (Fig. 8.3) is of somewhat greater interest. Consider
the point of intersection of the wave front with the plane x = 0
(the point A in Fig. 8.3). In the course of time this point moves
to the right. It is easy to find the velocity of its displacement:
having chosen the section BD equal to c, we obtain AD = c/sin θ.
But AD is just the path covered by the point A per unit of time,
i.e. the velocity of the point A. Since sin θ < 1, this velocity can
be easily made greater than c. To dramatize the situation, let us
imagine the plane x = 0 covered with a luminescent paint. Then a
luminous point will run along the axis at a faster-than-light velocity.

Fig. 8.3. The point of contact of an incident electromagnetic wave with a
plane surface can move at a faster-than-light velocity. In the figure,
AD = BD/sin θ = c/sin θ.

Surely, a luminous point moving at the velocity v > c can be
realized more simply, so to speak, “manually”. Arrange electric bulbs
along the x axis and switch them on one after another (independently)
from left to right with a given time lag. Naturally, you can
get a luminous point moving at any velocity. From this second
example it is seen that in this process no information can be transmitted
since each source radiates independently. But is it possible
to attain faster-than-light velocities by means of a relatively slow
rotation of a solid body of a considerable radius? For example,
a disc of radius r ≥ c rotating at the angular velocity ω = 1
would possess the linear velocity v = c and over at its rim. However,
such a velocity cannot be reached due to relativistic properties
of the motion equation. As the linear velocity of some sections of
a body increases, the forces required to accelerate these sections
become greater and greater, and consequently the linear velocity
of the farthest sections of the body cannot exceed c in this case
either.
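Both kinds of luminous point are simple to put into numbers. The following sketch assumes that θ is the angle marked in Fig. 8.3 and that the bulbs are equally spaced and fired with a constant lag; the function names, the spacing and the lag are hypothetical values used only for illustration.

```python
import math

C = 299_792_458.0   # speed of light in vacuo, m/s


def wavefront_contact_speed(theta):
    """Speed of the point A along the surface, AD = c / sin(theta)."""
    return C / math.sin(theta)


def bulb_chain_speed(spacing, lag):
    """Apparent speed of the luminous point for bulbs `spacing` metres apart
    switched on independently with a time lag of `lag` seconds."""
    return spacing / lag


print("contact point at theta = 30 deg:", wavefront_contact_speed(math.pi / 6) / C, "c")
print("bulbs 1 m apart, 1 ns lag:      ", bulb_chain_speed(1.0, 1e-9) / C, "c")
```

In neither case does anything travel from one point of the surface to the next, so the numbers above are not signal velocities.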

If it is impossible to rotate a solid body, then let us try to rotate
a light beam. Let us place a searchlight at the origin of coordinates
and start rotating it at the angular velocity Ω. Let us circumscribe
a stationary sphere of radius r around the origin. Then
the light spot will run along the surface of this sphere at the linear
velocity

$$v = \Omega r.$$


This velocity can exceed the velocity of light. The example of such 
a beam is provided by a rotating pulsar. The light spot of the Crab 
Nebula pulsar runs along the Earth surface at the velocity equal
approximately to 10²² m/s. But as in the previous cases no signal
is transmitted at such a velocity. As a matter of fact, every point
of a screen (the Earth) receives a new portion of light energy
from a searchlight (pulsar), but not from a neighbouring point of
the screen. Therefore, it is impossible to transmit information from
one point of the screen to another.

Fig. 8.4. The light spot reflected from a rotating mirror can run along a
distant screen at a faster-than-light velocity.
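The order of magnitude of the spot velocity for a pulsar beam can be estimated from the product of angular velocity and distance. The numbers below are assumptions introduced for this sketch, not data from the text: a rotation period of about 0.033 s and a distance of about 2000 parsecs are taken as round figures for the Crab Nebula pulsar.

```python
import math

C = 299_792_458.0      # speed of light in vacuo, m/s
PARSEC = 3.0857e16     # metres in one parsec


def spot_speed(angular_velocity, radius):
    """Linear speed of a light spot on a sphere of the given radius."""
    return angular_velocity * radius


# Assumed round figures for the Crab Nebula pulsar.
crab_period = 0.033                 # s
crab_distance = 2000.0 * PARSEC     # m

crab_spot = spot_speed(2.0 * math.pi / crab_period, crab_distance)
print("spot speed, m/s:       ", crab_spot)
print("spot speed, units of c:", crab_spot / C)
```

The result exceeds c by many orders of magnitude, which underlines that the spot velocity is a purely geometrical quantity.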

In fact, the same idea can also be realized as follows. A light 
beam from the source 1 falls on a mirror consisting of several
facets and rotating at the angular velocity w. Depending on the 
velocity w and the distance to the screen, one can get the motion 
of the light spot (the source’s image) at a linear velocity exceed- 
ing that of light. Let us fabricate a reflecting mirror in the shape 
of an ellipsoid and place a rotating mirror in one of its focal 
points (Fig. 8.4). Then the beam reflected from the mirror will 
pass through the second focal point in accordance with the well- 
known property of an elliptical surface. This second focal point 
can accommodate an analysing receiver. The light spot running 
along the mirror represents the image of the source regardless of 
the velocity of the spot. 

The phase velocity of electromagnetic waves in a medium can 
also exceed the velocity c. It is defined by the formula v = c/n
where n is the refraction index. There are cases when the refraction
index n < 1 and, consequently, v > c. All such cases relate
to a medium and to certain frequencies of electromagnetic waves.
For example, many substances have the refraction index n < 1 in
the range of hard X-rays. The same is true for plasma. But there 
is no contradiction with the STR here again. The fact is that a 
signal transmission velocity is not defined by a phase velocity. In 
a dispersive medium, i.e. a medium whose refraction index depends 
on the frequency of light passing through it, a signal can be transmitted
by means of electromagnetic waves whose frequency spectrum
is sufficiently narrow (a group of waves). The velocity of a
signal is the velocity of energy transmission by such a group; as
a more detailed analysis (see [36]) shows, the velocity of energy
transmission is defined by the group velocity. But the group velocity
always turns out to be less than c, with the exception of the
anomalous dispersion region where the group velocity formally
exceeds c. In this region, however, the concept of a group velocity,
and consequently the signal transmission velocity, loses its meaning.
Thus, with the aid of wave processes, a signal is actually always
transmitted at a velocity less than c.
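The plasma case mentioned above gives a compact illustration of a phase velocity above c together with a group velocity below c. The sketch assumes the simple collisionless plasma model n(ω) = √(1 − ω_p²/ω²), which is not derived in the text; the function names and the sample frequency are likewise chosen only for this example.

```python
import math

C = 299_792_458.0   # speed of light in vacuo, m/s


def plasma_index(omega, omega_p):
    """Refraction index of the assumed collisionless plasma model, valid for omega > omega_p."""
    if omega <= omega_p:
        raise ValueError("the wave does not propagate for omega <= omega_p")
    return math.sqrt(1.0 - (omega_p / omega) ** 2)


def phase_velocity(omega, omega_p):
    """v_ph = c / n, which exceeds c because n < 1."""
    return C / plasma_index(omega, omega_p)


def group_velocity(omega, omega_p):
    """v_g = d(omega)/dk = c * n for the dispersion law omega^2 = omega_p^2 + c^2 k^2."""
    return C * plasma_index(omega, omega_p)


omega_p = 1.0            # plasma frequency in arbitrary units
omega = 2.0 * omega_p    # sample wave frequency
print("v_ph / c =", phase_velocity(omega, omega_p) / C)
print("v_g  / c =", group_velocity(omega, omega_p) / C)
```

In this model the product of the two velocities equals c², so the faster the phase runs, the slower the group carrying the signal.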





Fig. 8.5. A rectangular loop with an elastic thread stretching a sphere along the
frame’s diagonal. (a) The picture observed in the proper reference frame K⁰;
(b) the same picture observed in the frame K; (c) when the sphere is replaced
by a dumb-bell, it experiences the action of a couple in terms of the reference
frame K.


§ 8.2. The thread-and-lever paradox. Let a plane rectangular
loop ABCD be at rest in its proper reference frame K⁰. An elastic
thread stretches a sphere of the mass m from two sides along the
diagonal AC (Fig. 8.5a). In the frame K⁰ the direction of the
thread is found from the triangle ABC. Designating AB = a₀ and
BC = b₀, we have

$$\tan\alpha_0 = b_0/a_0.$$

Since in the frame K⁰ the elastic forces are directed along the
thread, we can also write that

$$\tan\alpha_0 = b_0/a_0 = F^0_{1x}/F^0_{1y}, \tag{8.1}$$

where $\mathbf{F}^0_1$ denotes the force directed toward the apex C. Similar
relations are valid for $\mathbf{F}^0_2$ as well.

Now let us pass to the frame K relative to which the frame K⁰
moves at the velocity V. As usual, we assume that the x⁰ and x
axes coincide and that the y⁰, y and z⁰, z axes are respectively
parallel. According to the formulae (3.5) and (5.34) of length and
force transformation we get

$$a = a_0, \qquad b = b_0\,(1 - \beta^2)^{1/2}, \tag{8.2}$$

$$F_{1x} = F^0_{1x}, \qquad F_{1y} = F^0_{1y}\,(1 - \beta^2)^{1/2}. \tag{8.3}$$

It is seen from this that Eq. (8.1) does not hold any more; in
the frame K the angle defining the thread direction and the angle
defining the direction of the forces are by no means equal:

$$\tan\alpha' = b/a = (b_0/a_0)\,(1 - \beta^2)^{1/2} \ne b_0/a_0 = \tan\alpha_0, \tag{8.4}$$

$$\tan\alpha'' = F_{1x}/F_{1y} = (F^0_{1x}/F^0_{1y})/(1 - \beta^2)^{1/2} \ne F^0_{1x}/F^0_{1y} = \tan\alpha_0. \tag{8.5}$$


Although the sum of the forces remains equal to zero as before,
these forces in the frame K are directed at a certain angle to the
thread (Fig. 8.5b). At first glance this circumstance seems surprising.
Indeed, what happens, for example, if we cut the thread
at the section 2? In the frame K⁰ the acceleration must be parallel
to the force direction at the initial moment, i.e. it is directed along
the thread (since this is an explicitly non-relativistic case, the conventional
law of Newton is quite applicable). It seems that in the
frame K the acceleration should be directed at a certain angle to
the thread since the direction of the thread and the direction of
the force F₁ do not coincide. These statements are clearly contradictory,
but the paradox is resolved simply: in relativistic dynamics
the acceleration does not, generally speaking, coincide with the
acting force direction and although the force is directed at an
angle to the direction of the thread, the acceleration is oriented
along the thread. The paradox itself provides a useful illustration
of the characteristics of the relativistic equation of dynamics.

Let us make sure that in both frames the acceleration of the
sphere is directed along the thread. It is convenient to write the
relativistic equation of motion in the form

$$m\,\frac{d\mathbf{v}}{dt} = \gamma^{-1}\left[\mathbf{F} - \frac{\mathbf{v}}{c^2}\,(\mathbf{F}\mathbf{v})\right];$$

here m is the mass of the sphere, F the conventional three-dimensional
force acting on the sphere, v the velocity of the sphere,
γ = (1 − β²)^(−1/2) where β = v/c.

In the frame K⁰ at the moment t = 0 when the thread 2 is cut

$$m\,\frac{d\mathbf{v}^0}{dt} = \mathbf{F}^0_1,$$

or, in components,

$$m\,\frac{dv^0_x}{dt} = F^0_{1x}, \qquad m\,\frac{dv^0_y}{dt} = F^0_{1y}.$$

Having divided the first relation by the second one termwise, we
obtain the formula defining the direction of motion at the initial
moment:

$$dv^0_x/dv^0_y = F^0_{1x}/F^0_{1y} = \tan\alpha_0.$$

In accordance with Eq. (8.1) this acceleration direction coincides
with the direction of the thread, as it should be. Thus, in K⁰ the
forces and the acceleration are parallel, and the motion is directed
along the thread at the initial moment.

Now let us pass over to the frame K. In this frame the sphere
moves at the velocity coinciding with that of the frame K⁰, i.e. V.
Therefore γ = Γ and the acceleration components can be written
here as follows:

$$m\,\frac{dv_x}{dt} = \left[F_{1x} - (V/c^2)\,F_{1x}V\right]/\Gamma = F_{1x}/\Gamma^3, \tag{8.6}$$

$$m\,\frac{dv_y}{dt} = F_{1y}/\Gamma; \tag{8.7}$$

we have taken into account here that the velocity of the sphere
coincides with the velocity of the frame K⁰, i.e. is equal to V and
has the components (V, 0, 0); $F_{1x}$ and $F_{1y}$ are the force components
in the frame K *. In order to find the acceleration direction
in K, we shall divide Eq. (8.6) by (8.7) termwise:

$$\frac{dv_x}{dv_y} = (F_{1x}/F_{1y})/\Gamma^2 = \tan\alpha''/\Gamma^2 = \tan\alpha_0/\Gamma = \tan\alpha'. \tag{8.8}$$

In the third link of this chain of equations we utilized the formula
(8.5) and in the last link the formula (8.4). But we see from Eq.
(8.8) that the acceleration in K at the initial moment is also directed
along the threads, and therefore no paradox can arise.
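The chain of equalities in Eq. (8.8) can be verified with concrete numbers. The sketch below is an illustration under assumed values (a loop with a₀ = 3, b₀ = 4 and β = 0.6, unit tension and unit mass, units with c = 1); the function name and the dictionary keys are hypothetical.

```python
import math


def thread_paradox(a0, b0, beta, tension=1.0, mass=1.0):
    """Directions of the thread, the force and the acceleration in the frame K (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)

    # Proper frame K0: the force is directed along the thread, Eq. (8.1).
    diagonal = math.hypot(a0, b0)
    f0x = tension * b0 / diagonal
    f0y = tension * a0 / diagonal

    # Frame K: length and force transformations, Eqs. (8.2) and (8.3).
    a, b = a0, b0 / gamma
    fx, fy = f0x, f0y / gamma

    # Relativistic equation of motion with v = (V, 0, 0), Eqs. (8.6) and (8.7).
    f_dot_v = fx * beta
    accel_x = (fx - beta * f_dot_v) / (gamma * mass)
    accel_y = fy / (gamma * mass)

    return {
        "tan_thread": b / a,            # tan(alpha'),  Eq. (8.4)
        "tan_force": fx / fy,           # tan(alpha''), Eq. (8.5)
        "tan_accel": accel_x / accel_y, # left-hand side of Eq. (8.8)
    }


result = thread_paradox(a0=3.0, b0=4.0, beta=0.6)
for name, value in result.items():
    print(name, value)
```

The acceleration direction coincides with the thread direction while the force direction differs from both, as the text asserts.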

Let us imagine, however, that instead of the sphere that was
implied to be a point, the threads stretch some solid object, e.g.
a dumb-bell. Then in the frame K the spheres of the dumb-bell
would be subjected to the couple (Fig. 8.5c) and the dumb-bell
would be shifted relative to the diagonal of the loop.

* It is easy to notice that Eqs. (8.6) and (8.7) correspond to two exceptional
cases of the relativistic equation when the force and acceleration are parallel;
formerly the corresponding masses were referred to in this case as “transverse”
and “radial” masses. Later these rather incongruous terms have been practically
discarded although they convey the tensor character of the relationship between
the force and acceleration in relativistic mechanics quite satisfactorily.

It is obvious, however, that in the proper frame the dumb-bell
axis coincides with the diagonal. Certainly, we come across a
paradox here. This paradox, however, represents a version of the
well-known paradox of the lever which we shall discuss now. Let
a lever be at rest in the frame K⁰ (Fig. 8.6). It is in the state of
equilibrium in spite of the fact that two forces, $F^0_x$ and $F^0_y$, act on
it, each of which is directed along the respective coordinate axis.



Fig. 8.6. The lever paradox. In the frame K⁰ the lever is in equilibrium and the
resultant force moment is equal to zero. When the same lever is considered in
terms of the frame K, the force moment different from zero arises in accordance
with the formulae for length and force transformations. The STR provides a very
elegant explanation why the lever is at rest in terms of the frame K (see the
text).


The equilibrium is ensured by the equality of the force moments
in K⁰:

$$F^0_x l^0_y = F^0_y l^0_x; \tag{8.9}$$

these moments are oriented in opposite directions.

The same lever can be considered in terms of the frame K relative
to which the lever moves as a whole at the velocity V. Having
formed an expression for the moments of the forces $F_y$ and $F_x$ in
K, we note that they are no longer equal, and, consequently, a
resultant force moment acting on the lever must appear. In fact,
in accordance with Eqs. (5.34) and (3.5) we have

$$F_x = F^0_x, \qquad F_y = F^0_y\sqrt{1 - \beta^2},$$

$$l_x = l^0_x\sqrt{1 - \beta^2}, \qquad l_y = l^0_y.$$

The difference of the moments of the forces $F_x$ and $F_y$ produces
a torque in K

$$L = F_x l_y - F_y l_x = F^0_x l^0_y - F^0_y l^0_x\,(1 - \beta^2) = \beta^2 F^0_y l^0_x = \beta^2 F_x l_y, \tag{8.10}$$

where Eq. (8.9) is used. Thus, the paradox consists in the following:
although it is well known that the lever is motionless, in the
frame K the lever is subjected to a force moment and, conse- 
quently, must be rotating. The witty solution of this paradox 
belongs to Laue. We have got used to a force moment inducing a 
rotation, or, in other words, causing the appearance of the moment 
of momentum in the system. In the frame K the force moment in- 
deed defines the velocity at which the moment of momentum grows, 
but this growth is not associated with the rotation of the lever. 
Then where does the increment of the moment of momentum come
from? Let us consider the work performed by the forces $F_x$ and $F_y$
in the frame K. In the frame K the lever moves, and the force $F_x$
performs the work $-F_xV$ per unit of time. The force $F_y$ performs
no work since it is oriented normally to the lever velocity direction.
Consequently, at the end of the lever, i.e. at the point of application
of the force $F_x$, the work is performed, and the energy of
the lever at this point increases by the quantity $-F_xV$ per unit of
time. This means that the lever mass at the point of application of
the force increases by $-F_xV/c^2$ per unit of time. Multiplying this
quantity by the lever velocity V, we obtain the increment of the
momentum $-F_x\beta^2$. Hence, the moment of momentum grows by
$-F_x l_y \beta^2$ per unit of time. This is precisely the resultant moment
cited in Eq. (8.10). So, this additional moment does not describe
the rotation but defines the velocity at which the system’s moment
of momentum varies. This explanation has some weak points. In
the STR there are no absolutely rigid bodies, and we must make
allowance for the deformation of the lever. In the foregoing reasonings
the lever was tacitly assumed to keep its form. In the
frame K⁰ we must consider the lever’s arms bent by the forces
$F^0_x$ and $F^0_y$.
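Laue's bookkeeping can be checked numerically by comparing the torque found from the transformed forces and arms with the rate at which the moment of momentum grows because of the energy supplied at the end of the arm. The sketch below compares magnitudes only and leaves the sign convention aside; it works in units with c = 1, and the function name and sample values are assumptions made for illustration.

```python
import math


def lever_in_frame_K(f0x, l0x, l0y, beta):
    """Return (torque, angular_momentum_rate) in the frame K, in units with c = 1."""
    # Equilibrium in K0, Eq. (8.9), fixes the second force.
    f0y = f0x * l0y / l0x

    # Transformations of forces and arms to the frame K.
    root = math.sqrt(1.0 - beta**2)
    fx, fy = f0x, f0y * root
    lx, ly = l0x * root, l0y

    # Resultant torque, Eq. (8.10).
    torque = fx * ly - fy * lx

    # Laue's argument: power fx*V corresponds to a mass rate fx*V/c^2,
    # hence a momentum rate fx*beta^2 applied at the arm ly.
    angular_momentum_rate = fx * beta**2 * ly

    return torque, angular_momentum_rate


torque, rate = lever_in_frame_K(f0x=2.0, l0x=1.0, l0y=3.0, beta=0.6)
print("torque in K:                  ", torque)
print("rate of moment of momentum:   ", rate)
```

The two numbers coincide, so the torque in K is fully accounted for by the growing moment of momentum without any rotation of the lever.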

Considering this lever, we come across still another paradoxical
result. Suppose, no forces act on the lever till the moment t = 0
when $F^0_x$ and $F^0_y$ are “switched on” simultaneously in the frame K⁰.
At every moment of time the equilibrium in the frame K⁰ will be
maintained. In the frame K, however, the forces will not be switched
on simultaneously since there will be a time interval when the force
$F_x$ is already acting and the force $F_y$ has not been “switched on”
yet. Consequently, a force moment arises. The following simple
example shows that the forces applied at different points of the
body are indeed essential. (The paradoxes appear, of course, when
a solid body is considered.) Let in the frame K⁰ a solid body of the
length l⁰ be located along the x⁰ axis. No forces act on this body
until the moment t = 0 when the oppositely directed equal forces
are switched on at both sides. In the frame K⁰ the equilibrium is
permanently maintained while in the frame K there is a time interval
during which the forces are not balanced and consequently
the body should start moving. We shall leave this paradox for the
reader to analyse.

§ 8.3. The tachyons. This is the name for particles whose velocity
exceeds that of light in vacuo. From the very beginning we should
warn the reader that we speak of hypothetical particles: the
experimental attempts to observe such particles have not succeeded
so far. But the very idea of their existence seems paradoxical: the
finite velocity of signal transmission is fundamental for the STR,
the ultimate value being the velocity c. Of course, velocity per se
has no limitations whatever (see § 8.1), but signal transmission is
the propagation of energy and momentum. The motion of the particles
to which we are accustomed can certainly serve as a signal. Besides,
the conventional particles possessing a finite rest mass, with which
we are familiar, cannot reach the velocity of light. From the
relativistic equation of
motion for such particles it follows that the velocity of light can 
be reached only after an infinitely long time (not to mention the 
fact that an infinitely high energy would be needed in that case). 
Thus, the question about the faster-than-light velocity of particles 
in our conventional world no longer arises. 
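
The statement that the velocity of light is reached only after an infinitely long time can be illustrated numerically. The sketch below is not taken from the book: it assumes the simplest case of a particle starting from rest under a constant force F, for which the relativistic equation of motion gives $m_0 \gamma v = Ft$. The function name and the numerical values of the mass and force are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuo, m/s


def velocity_under_constant_force(force, rest_mass, t):
    """Velocity at time t of a particle starting from rest under a constant force.

    Assumes d(m0 * gamma * v)/dt = F, i.e. m0 * gamma * v = F * t.
    """
    w = force * t / (rest_mass * C)       # dimensionless momentum p / (m0 c)
    return C * w / math.sqrt(1.0 + w * w)  # always below C for finite t


if __name__ == "__main__":
    m0 = 9.109e-31  # illustrative rest mass, kg (roughly an electron)
    F = 1.0e-15     # illustrative constant force, N
    for t in (1e-12, 1e-9, 1e-6, 1e-3):
        v = velocity_under_constant_force(F, m0, t)
        print(f"t = {t:8.1e} s   v/c = {v / C:.12f}")
```

For small t the result coincides with the Newtonian value Ft/m₀; for large t the ratio v/c approaches unity from below without reaching it.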

One may, however, assume the existence of a special group of
particles whose conversion into conventional particles and back is
impossible. These particles, possessing faster-than-light velocities
from the very beginning of their existence, could have been
generated in certain nuclear transformations. The assumption
concerning the generation of tachyons was suggested by the picture
of photon generation: photons possess the velocity of light from the
very beginning and do not emerge "dynamically", as a result of an
acceleration of conventional particles.

As in § 3.5 it can be shown that if the velocity v of a particle
exceeds c in one IFR, this is true in any other IFR. Consequently,
the conventional particles, photons and tachyons form separate
groups of particles; the transition from one group of particles to
another by means of acceleration is impossible; the transition from
one IFR to another leaves a particle in the same group to which
it belonged in the initial IFR.

Assuming the existence of such particles, let us consider the 
kinematic consequences of such an assumption. 

So, let us assume that the velocity of a tachyon v determined by
the conventional means exceeds c, i.e. β = v/c > 1. Then for the
interval between two events, the positions of a tachyon at two
points in space at two moments of time, we get, as usual,

$$ds^2 = c^2\,dt^2 - dx^2 = c^2 (1 - \beta^2)\,dt^2.$$

Here we consider the motion along the x, x′ axis. As distinct
from the conventional particles, for a tachyon $ds^2 < 0$, i.e. the
interval is space-like. We saw in § 3.4 that in that case the
concepts "later" and "earlier" for two events are no longer
absolute. Consequently, there are reference frames in which a
tachyon moves in one direction, and there are others in which it
moves in the opposite direction. One can find the condition making
the velocity of a tachyon reverse its direction in a certain
reference frame K′. In the frame K′ we obtain for a tachyon


$$\Delta t' = \Gamma\left(\Delta t - \frac{V}{c^2}\,\Delta x\right) = \Gamma\,(1 - \beta B)\,\Delta t.$$

Here Δx = v Δt and, as for any IFR, B = V/c < 1. The time
intervals Δt′ and Δt have opposite signs, which means that the
sequence of events in time is reversed, provided 1 − βB < 0. From
this the required condition v > c²/V follows; it is clear that
then v > c. The differences in describing the motion of a tachyon
in the frames K and K′ are clearly seen in Fig. 8.7a. In the frame
K the simultaneity lines are parallel to the x axis, and drawing
them farther and farther along the positive ct axis, we mark the
position of the tachyon more and more to the right: the tachyon
moves to the right. In the frame K′ the simultaneity lines are
parallel to the x′ axis. Drawing these lines so that they intersect
the ct′ axis farther and farther along its positive direction, we
find the tachyon located more and more to the left: the tachyon
moves to the left.

Fig. 8.7. (a) The motion of a tachyon considered in two IFRs. In
the frame K a tachyon moves to the right, in K′ to the left. The
bold line represents the world line of the tachyon. (b) The
reversal of the time sequence of events for a moving tachyon.
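
The reversal condition v > c²/V can be checked numerically. The following sketch is an illustration, not part of the original text: it assumes the usual one-dimensional velocity transformation β′ = (β − B)/(1 − βB) together with the relation Δt′ = Γ(1 − βB)Δt discussed above. The function name and the sample values are hypothetical.

```python
import math


def transform_tachyon(beta, B):
    """Return (beta_prime, dt_ratio) for a particle of velocity beta*c as seen
    from a frame K' moving at B*c along the x axis (|B| < 1).

    dt_ratio is dt'/dt = Gamma * (1 - beta * B).
    """
    if abs(B) >= 1.0:
        raise ValueError("the frame velocity must be below c")
    Gamma = 1.0 / math.sqrt(1.0 - B * B)
    denom = 1.0 - beta * B
    if denom == 0.0:
        # Borderline case v = c^2/V: infinite velocity in K', dt' = 0.
        return math.inf, 0.0
    return (beta - B) / denom, Gamma * denom


if __name__ == "__main__":
    B = 0.5  # frame velocity V = 0.5c, so the threshold is v = c^2/V = 2c
    for beta in (1.5, 3.0):
        beta_prime, ratio = transform_tachyon(beta, B)
        print(f"beta = {beta}: beta' = {beta_prime:+.3f}, dt'/dt = {ratio:+.3f}")
```

Below the threshold the tachyon keeps its direction and time order in K′; above it both the velocity and Δt′ change sign, while |β′| stays greater than unity in either case.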

The same result can be presented in a more dramatic manner
(Fig. 8.7b). Let a tachyon that left the point O arrive at the
world point P in the frame K. In the frame K, as is seen in the
figure, the tachyon was "emitted" at the moment t = 0 ("earlier")
and arrived at the point P at the moment $t_1$, i.e. "later". The
same figure features the spatial and temporal axes of the frame K′,
in which the simultaneity lines are parallel to the x′ axis. It is
seen in the figure that in the frame K′ the tachyon was first at
the point P (at the moment $-t'_1$) and then moved to the point O,
to be absorbed at the moment t′ = 0. In this way, we can obtain the
movement of the tachyon in the opposite direction in space merely
by a proper choice of the reference frame. As a result, in a
certain reference frame we can observe the absorption of a tachyon
instead of its emission.

Fig. 8.8. The observation of a luminous particle moving at a
faster-than-light velocity.

At the same time, we shall mention a curious picture of a 
“luminous tachyon”, that is a tachyon which radiates light. 
Fig. 8.8 illustrates that an observer at rest in the frame K will 
“see” two such tachyons diverging in opposite directions.

Now let us go back to the reversal of the sequence of events in
time and, in particular, to the exchange of "emission" and
"absorption". At first glance, this situation contradicts the
conventional cause-and-effect relations. Indeed, suppose it is
known that the source of tachyons is located at the point O. The
source is the "cause" of the tachyon generation. The motion of the
tachyon toward P is the "effect" of the tachyon generation. The
observation in the frame K′ shows, however, that the tachyon leaves
the point P and is absorbed at the point O. No matter how strange
it may seem, it must be admitted that the observed sequence does
not contradict the cause-and-effect relations, provided we define
clearly what we mean by them. For example, one may argue as follows.

We shall take A for the cause and B for the effect, provided that
a repetition of the event A at the moments $t_1, t_2, \ldots$ chosen
at will leads unfailingly to the occurrence of the event B at the
moments $t_1 + T, t_2 + T, \ldots$. The essential point here is the
controlled repetition of the event A and its correlation with the
event B. In this sense the cause-and-effect relations do not depend
on which event occurs earlier and which event occurs later. The
sequence of events in time is not involved in the definition of a
cause-and-effect relationship and cannot be used in order to
differentiate between the cause and effect.

In our example the event to be controlled in the frame K′ is the
absorption of a tachyon. This controlled absorption will always be
preceded by the motion of the tachyon from the point P toward O. We
shall have to take the absorption of the tachyon for the cause and
its motion for the effect. The cited definition of cause and effect
conflicts with the conventional statement that “the absolute
meaning of the notions ‘earlier’ and ‘later’ ... is a requisite
condition for the concepts ‘cause’ and ‘effect’ to make sense”. Of
course, if the “cause” and “effect” happen at one point in a given
IFR, the cause must precede the effect. But then the interval
between the events is time-like a fortiori, and in any IFR the
effect will happen “later” than the cause. The tachyons behave
quite differently. All “events” involving tachyons happen at
different points, when considered from our point of view. The
reversal of the sequence of events is of no significance.

Fig. 8.9. The closed cause-and-effect cycle involving
faster-than-light signals. Lines I and II are the world lines of
two reference frames. The first faster-than-light signal AB is sent
from the point A; AA′ is the simultaneity line. A faster-than-light
signal CD is sent in return from the point C of frame II which
arrives at the point D of frame I before the first signal was sent
(the point A). The simultaneity lines and the world lines of
faster-than-light signals are drawn in accordance with Fig 265.

Hence, the reversal of the sequence of events in time does not
contradict the conventional notions of the cause-and-effect
relationship. However, there is one condition that has to be
satisfied in any case: it is impossible to exert influence on the
past from the present. A signal sent from a given point in space
cannot return to that point before it was sent.

If tachyons served as signals, it would be possible, as is seen
from Fig. 8.9, to send a signal so that another signal caused by
the former one arrives at the initial point before the first signal
was sent (the cause-and-effect cycle). Fig. 8.9 shows the world
lines of two bodies, I and II, which were at rest initially, then
moved uniformly and rectilinearly at equal velocities and finally
came to rest again. The world points A and A′ are located on the
simultaneity line common to both moving bodies. The world points C
and C′ are located on the simultaneity line common to both resting
bodies. The figure also illustrates the world lines of two
faster-than-light signals AB and CD. Having sent the signal AB and
then, on its reception in the other frame, the return signal CD, we
will receive the signal CD at the point D before the signal from A
was sent.

Thus, we have analysed the example of a closed cause-and- 
effect cycle indicating explicitly the possibility of exerting influence 
on the past. Certainly, this result pertains to any faster-than-light 
signals, but when applied to tachyons, it implies that tachyons 
themselves, unlike the conventional particles, cannot serve as 
signals. 
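
The closed cycle can be reproduced with a few lines of arithmetic. The sketch below is a simplified illustration and not the exact arrangement of Fig. 8.9: it assumes two observers in uniform relative motion (observer I at rest at x = 0 in K, observer II moving along x = Vt), units with c = 1, and signals whose speed u > 1 is measured in the rest frame of the sender. All function names are hypothetical.

```python
import math


def lorentz(t, x, V):
    """Transform the event (t, x) from K to a frame moving at velocity V (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return g * (t - V * x), g * (x - V * t)


def reply_arrival_time(T, V, u):
    """Time in K at which the reply reaches observer I, who rests at x = 0.

    Observer II moves along x = V*t. Observer I sends a signal of speed u > 1
    at the time T; II answers at once with a signal of the same speed u as
    measured in its own rest frame K'.
    """
    # Event 1: the outgoing signal x = u*(t - T) meets observer II at x = V*t.
    t1 = u * T / (u - V)
    x1 = V * t1
    # The same event in K', the rest frame of II (x1p is zero up to rounding).
    t1p, x1p = lorentz(t1, x1, V)
    # In K' the reply travels as x' = x1p - u*(t' - t1p); observer I moves as
    # x' = -V*t'. Their intersection gives the reception event in K'.
    t2p = (x1p + u * t1p) / (u - V)
    x2p = -V * t2p
    # Back to K: the inverse transformation is a boost with velocity -V.
    t2, _ = lorentz(t2p, x2p, -V)
    return t2


if __name__ == "__main__":
    T, V = 1.0, 0.8
    for u in (1.5, 4.0):
        print(f"u = {u}: reply received at t = {reply_arrival_time(T, V, u):.4f}, sent at T = {T}")
```

Under these assumptions the reply arrives before the emission time T once u exceeds (1 + √(1 − V²))/V in units of c, which is the closed cause-and-effect cycle described in the text.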

If one assumes that tachyons exist and the requirements of the
cause-and-effect cycle are satisfied, the resulting possibility of
reversing the sequence of events in time for tachyons allows the
objections concerning the "dynamic" properties of these particles
to be discarded. If one assumes the basic relations of the STR to
hold for tachyons, the transformation formulae for the velocity and
energy of a particle (see Chapters 3 and 5)

$$\beta' = \frac{\beta - B}{1 - \beta B}, \qquad \mathcal{E}' = \Gamma\,\mathcal{E}\,(1 - \beta B)$$

show that the tachyon's energy becomes negative in those reference
frames where the sequence of events reverses its order and where
the sign of the velocity changes to the opposite, since Δt and Δt′
are of opposite sign. The negative energy of a tachyon is
inadmissible since its existence would imply the possibility of
obtaining unlimited energy. In fact, the joint generation of a pair
of tachyons, one possessing negative and the other positive energy,
would not require any expenditure of energy, and the positive-energy
tachyon could perform useful work.
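
The energy formula can be traced back to the Lorentz transformation of energy and momentum. The lines below are a short sketch of that step, assuming the Chapter 5 relations in the standard one-dimensional form $\mathcal{E}' = \Gamma(\mathcal{E} - Vp)$ and $p = \mathcal{E}v/c^2$ (the corresponding equation numbers of the book are not reproduced here):

```latex
\mathcal{E}' = \Gamma\,(\mathcal{E} - V p)
             = \Gamma\,\mathcal{E}\left(1 - \frac{V v}{c^{2}}\right)
             = \Gamma\,\mathcal{E}\,(1 - \beta B),
\qquad \beta = \frac{v}{c}, \quad B = \frac{V}{c}.
```

Since Γ > 0, the sign of the energy in K′ is governed by the same factor 1 − βB that governs the sign of Δt′, so the energy and the time order of the events change sign together.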

We have seen, however (see Fig. 8.8), that if in the reference
frame K a tachyon is emitted and absorbed, in the frame K′ where
the tachyon's velocity obeys the condition v > c²/V the same
process can be described as an absorption of a tachyon moving in
the opposite direction and possessing positive energy. This
circumstance makes it possible to avoid the difficulty associated
with the emergence of negative energies.

And, in conclusion, a few remarks concerning the momentum
and energy of tachyons. In a unidimensional case ($p_x = p$) the
STR yields (see Chapter 5) the following equation:

$$\mathcal{E}^2 - p^2 c^2 = m_0^2 c^4. \tag{8.11}$$

Plotting $\mathcal{E} = \mathcal{E}(p)$, we get a hyperbola; besides,
as we have seen in § 5.5,

$$v = d\mathcal{E}/dp. \tag{8.12}$$


If the particle accelerates, it moves along the hyperbola (8.11) in
the plane ($\mathcal{E}$, p). The tangent slope is always less than
c, irrespective of the manner in which the particle's energy
increases: whether via acceleration or due to the transition to
another reference frame. Since the particle's energy is positive,
the lower branch of the hyperbola is disregarded. Note also that
the hyperbola's asymptotes $\mathcal{E}^2 - p^2 c^2 = 0$ correspond
to photons. If one assumes tachyons to obey the basic formulae of
relativistic mechanics (see Chapter 5), the quantities
$p = m_0 \gamma v$ and $\mathcal{E} = m_0 c^2 \gamma$ become
imaginary since $\gamma = (1 - \beta^2)^{-1/2}$ and β > 1, so that,
up to the choice of the sign of the square root,
$\gamma = i\gamma_*$, where $1/\gamma_* = \sqrt{\beta^2 - 1}$. Real
values of momentum and energy can be obtained, provided we take the
quantity $i m_*$ for the mass. Why is an imaginary mass any better
than an imaginary energy and momentum? The point is that $i m_*$ is
the imaginary proper mass of a tachyon, and there is no reference
frame in which a tachyon could be at rest (a reference frame
consists of conventional particles and its velocity never exceeds
c). Therefore the tachyon's proper mass cannot be observed and can
be assumed to possess any magnitude.

But then in the plane ($\mathcal{E}$, p) we must consider two more
hyperbolas corresponding to the imaginary proper mass,
$\mathcal{E}^2 - p^2 c^2 = -m_*^2 c^4$. Consequently, we must
analyse three hyperbolas in the plane ($\mathcal{E}$, p)
(Fig. 8.10). The slope of the tangent to the two tachyon hyperbolas
is more than c at any point. Of course, the factor γ appears not
only in the expressions for momentum and energy, but also in the
definition of length via the proper length and the definition of
time intervals via the proper time. However, we can readily reject
the "proper" quantities, assuming them unobservable.
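
A brief numerical check of these relations may be useful. The sketch below is an illustration under stated assumptions: the real quantity m_* entering the imaginary proper mass is denoted `m_star`, units with c = 1 are used, and the overall signs of energy and momentum are chosen positive for a tachyon moving in the positive x direction. The function name is hypothetical.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for brevity)


def tachyon_energy_momentum(m_star, beta):
    """Real energy and momentum of a tachyon with beta = v/c > 1.

    m_star is the real quantity entering the imaginary proper mass i*m_star;
    gamma_star = 1 / sqrt(beta**2 - 1).
    """
    if beta <= 1.0:
        raise ValueError("a tachyon requires beta > 1")
    gamma_star = 1.0 / math.sqrt(beta * beta - 1.0)
    energy = m_star * C**2 * gamma_star
    momentum = m_star * gamma_star * beta * C
    return energy, momentum


if __name__ == "__main__":
    m_star = 1.0
    for beta in (1.1, 2.0, 10.0):
        E, p = tachyon_energy_momentum(m_star, beta)
        invariant = E**2 - (p * C) ** 2   # expected to equal -(m_star * C**2)**2
        slope = p * C**2 / E              # dE/dp along the hyperbola, equals v
        print(f"beta = {beta}: E = {E:.4f}, p = {p:.4f}, "
              f"E^2 - p^2c^2 = {invariant:.4f}, dE/dp = {slope:.4f}")
```

The invariant stays at −m_*²c⁴, the slope dℰ/dp = pc²/ℰ equals v > c, and the energy decreases as the tachyon's velocity grows.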

Having referred the reader elsewhere for details, let us
summarize. In the past few years some attempts were made to analyse
the properties of faster-than-light particles in terms of the STR.
The STR says that a velocity that cannot be associated with the
real physical propagation of anything can have any magnitude.
Conventional particles always move at a velocity less than c, that
is, any "signal" propagates at a velocity less than c.
Consequently, a tachyon cannot serve as a signal, i.e. its
interaction with our world is quite restricted. It might be
possible that the interaction of tachyons with our world is
accomplished through the exchange of electromagnetic signals.

Proceeding from the motto "everything that is not forbidden has
the right to exist", we must admit the possibility that tachyons
exist. So far, their existence is not directly prohibited by
theory. Still it seems highly unlikely that such particles really
exist. The last word rests with experiment.





Fig. 8.10. Tachyons and conventional particles depicted in the plane ($\mathcal{E}$, p).


§ 8.4. The clock paradox. This paradox, provided there is a
paradox here at all, arises due to the difference in measurements
of time intervals between events in different IFRs, which was
repeatedly discussed. We shall recall briefly the results that will
be needed later.

Let a body be at rest in the frame K′ and two events be registered
at the point x′ at the moments $t'_1$ and $t'_2$ by the clock moving
together with that body and the frame K′. The interval
$t'_2 - t'_1$ is the proper-time interval, to be denoted by Δτ. The
same two events will be registered by the observers of K at two
points of the frame K by two clocks at the moments $t_1$ and $t_2$.
The time interval between the same two events will turn out to be
equal to $\Delta t = t_2 - t_1$. We know that

$$\Delta\tau = \sqrt{1 - B^2}\,\Delta t, \tag{8.13}$$

i.e. the proper-time interval between events is less than the
interval between the same events registered by the clocks of the
frame relative to which the body moves (cf. § 3.3).

Eq. (8.13) clearly shows an asymmetry in time readings. It
seems that we could argue as follows. Since all clocks in K are
synchronized, the time interval Δt measured by different clocks of
K can be equated to the time interval registered by one clock of K.
Then it will turn out that the rates of identical clocks in two
IFRs, K and K′, are different. But the STR is based on the complete
symmetry of inertial frames! And this symmetry does exist! We have
missed an important point in our reasoning. Since simultaneity is
relative, the clocks synchronized in one frame are not synchronized
in terms of another frame. The clock synchronization is relative!
The quantity Δt is by no means the proper-time interval for a clock
of the frame K. Let us make the requisite calculations.

Let clock III be at rest at the origin of the frame K′ moving
relative to K at the velocity V. Clock I synchronized in the frame K
rests at the point $x_1 = a$ and clock II synchronized in the same
frame rests at the point $x_2 = b$.

The variable coordinate of clock III in the frame K is equal to
$x_3 = Vt$. Consequently, the coordinates of clocks I, II, III in
the frame K will be as follows:

$$x_1 = a \quad \text{(for clock I)}, \tag{8.14}$$
$$x_2 = b \quad \text{(for clock II)}, \tag{8.15}$$
$$x_3 = Vt \quad \text{(for clock III)}. \tag{8.16}$$


From the Lorentz transformation formula (2.11), $x = \Gamma(x' + Vt')$,
we can obtain the dependence of the coordinate x′ in the frame K′
on the time t′ in this frame and the coordinate x in K in the
following form:

$$x' = -Vt' + \frac{x}{\Gamma}.$$

In this way we shall find $x'_1$ and $x'_2$; as to $x'_3$, it is
obvious that $x'_3 = 0$. Consequently,

$$x'_1 = -Vt' + a/\Gamma \quad \text{(for clock I)}, \tag{8.17}$$
$$x'_2 = -Vt' + b/\Gamma \quad \text{(for clock II)}, \tag{8.18}$$
$$x'_3 = 0 \quad \text{(for clock III)}. \tag{8.19}$$


As usual, we assume the readings of clocks from different frames
to be comparable when the clocks are located at one point. Then
the following intercomparisons can be actually carried out. First,
the reading of clock III can be compared to that of clock I when
the clocks are passing each other; we shall denote the respective
readings of the clocks by $t'_1$ and $t_1$; secondly, the readings
of clocks III and II can be compared when those clocks pass each
other; we shall denote these readings by $t'_2$ and $t_2$
(Fig. 8.11). When clock III passes clock I (and likewise clock II),
both clocks are located at the point x′ = 0; therefore, in
accordance with Eqs. (8.17) and (8.18) the time moments $t'_1$ and
$t'_2$ will be given by the following relations:
$t'_1 = a/V\Gamma$ and $t'_2 = b/V\Gamma$. At the same time from
Eq. (8.16) we obtain $t_1 = a/V$, $t_2 = b/V$, i.e.
$t_1 = \Gamma t'_1$ and $t_2 = \Gamma t'_2$. The readings $t_1$ and
$t_2$ are the readings of different clocks synchronized in the
frame K.

Due to the synchronization in this frame, clock I showed the moment
$t_2$ when clock II showed the same moment $t_2$. The difference
$t_2 - t_1$ is the time in the frame K during which the reading of
clock III changed by $t'_2 - t'_1$. In terms of the frame K the
rate of clock III is defined by the following relation:

$$t'_2 - t'_1 = \frac{1}{\Gamma}\,(t_2 - t_1) = (t_2 - t_1)\sqrt{1 - B^2}, \tag{8.20}$$

as it should be, since $t'_2 - t'_1$ is the proper-time interval.
The moving clock observed from the frame K is slow. We are well
aware of all this. Now we take the decisive step: we need to
compare the rates of clocks I and II as observed from the frame K′.
To pass judgement on the clock rate, one should analyse the rate of
one of the clocks, say clock II. However, we have only one direct
reading of this clock: when clock II was against clock III, the
latter showed $t'_2$ and the former showed $t_2$. The other reading
of clock II has to be calculated (cf. § 2.4). We shall find where
clock II was located and what it showed at the moment clocks III
and I were against each other.

Fig. 8.11. The demonstration of the total symmetry with respect to
time deceleration: in any reference frame the proper-time interval
will be less than the time interval between the same two events
registered by two clocks of any other IFR.
Now let us analyse the situation in terms of the frame K′. When
clock III was against I, it showed the time $t'_1 = a/V\Gamma$.
Clock II was at the distance $x'_2 - x'_1 = (b - a)/\Gamma$ from
clock I (see Eqs. (8.17) and (8.18)). But when clock I was against
III, their common coordinate was $x'_1 = x'_3 = 0$. Therefore,
$x'_2 = (b - a)/\Gamma$ is the coordinate of clock II at the moment
clocks I and III coincide. Now it is easy to find the reading of
clock II at the same moment of time. We shall introduce
$x'_2 = (b - a)/\Gamma$ and $t'_1 = a/V\Gamma$ into the formula

$$t = \Gamma\left(t' + \frac{V}{c^2}\,x'\right).$$

(Since the clocks are synchronized in the frame K′, $t'_1$
coincides with the reading of the clock of K′ located at the point
$x'_2$, so that $t' = t'_1$.) As a result, we obtain the reading of
clock II:

$$t = \Gamma t'_1 + \Gamma\,\frac{V}{c^2}\,x'_2 = t_1 + \frac{V}{c^2}\,(b - a). \tag{8.21}$$


Were clocks I and II synchronized in K′, they would show the same
time. But they are synchronized only in K and not in K′. We see
that in terms of K′ the clocks of K are desynchronized, and the
following difference of readings appears:

$$\delta = \frac{V}{c^2}\,(b - a),$$




which grows as the clocks move away from each other. We already
obtained this result in § 2.4. Since in the frame K the distance
$b - a = V(t_2 - t_1)$, the reading of clock II will be
$t = t_1 + (V/c)^2 (t_2 - t_1)$. Forming the difference of the
recorded time $t_2$ and the calculated time t, we obtain

$$t_2 - t = (t_2 - t_1)\,\frac{1}{\Gamma^2},$$

or, in accordance with Eq. (8.20),

$$t_2 - t = (t'_2 - t'_1)\,\frac{1}{\Gamma}.$$

This means that the observer in the frame K′ will note that the
clock moving relative to him is slow. Thus, the full equivalence of
the frames is proved.
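
The chain of relations (8.16) to (8.21) is easy to verify numerically. The following sketch is an illustration only: it uses units with c = 1, hypothetical values of V, a and b, and a hypothetical function name, and it simply repeats the steps of the calculation above.

```python
import math

C = 1.0  # units with c = 1 (assumption)


def check_clock_symmetry(V, a, b):
    """Return (rate of clock III judged from K, rate of clock II judged from K').

    Clocks I and II rest in K at x = a and x = b; clock III rests at the
    origin of K', which moves at the velocity V relative to K.
    """
    Gamma = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    # Readings at the moments clock III passes clocks I and II,
    # from Eqs. (8.16), (8.17) and (8.18).
    t1, t2 = a / V, b / V
    t1p, t2p = a / (V * Gamma), b / (V * Gamma)
    # Position of clock II in K' at the moment t1p, then its reading
    # from t = Gamma * (t' + V x' / c^2), i.e. Eq. (8.21).
    x2p = (b - a) / Gamma
    t = Gamma * (t1p + V * x2p / C**2)
    rate_III_from_K = (t2p - t1p) / (t2 - t1)
    rate_II_from_Kp = (t2 - t) / (t2p - t1p)
    return rate_III_from_K, rate_II_from_Kp


if __name__ == "__main__":
    r_K, r_Kp = check_clock_symmetry(V=0.6, a=1.0, b=4.0)
    print(f"clock III judged from K : {r_K:.6f}")
    print(f"clock II judged from K' : {r_Kp:.6f}")
    print(f"1/Gamma                 : {math.sqrt(1 - 0.6**2):.6f}")
```

Both ratios come out equal to 1/Γ: each frame finds the clock of the other frame slow by the same factor.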

This result confirms the full equivalence of the two inertial 
frames considered: if in two IFRs there are two identical clocks, 
the proper-time intervals registered by these clocks are equal. 
Surely, it cannot be otherwise, since one of the basic principles of 
the STR is the principle of relativity: if the rates of identical clocks 
were different in two IFRs, such a physical method of distinguish- 
ing these frames would be possible. 

This is only the introductory explanation that had to be made. 
The clock paradox is, of course, something different. Suppose we 
compare the readings of two clocks: one from the frame K and the 
other from K′. Naturally, immediately after the comparison the
clocks will start differing from each other. Now the question arises: 
if we bring somehow one of these clocks back to the point where 
the other clock is located and intercompare their readings again, 
what should we expect? It is the answer to this question that is 
the clock paradox. This answer is far from being simple, and we 
wish the reader to arm himself with patience. 

First of all, we should point out that all formulae of the STR
involve the quantities treated in terms of inertial frames of
reference. All time measurements carried out in the STR are made
by means of clocks resting in one or another IFR. Having once
compared two clocks, we are no longer capable of bringing them
together at one point in space without taking them out of the
reference frame in which they were at rest during the initial
comparison. Indeed, if the motion is rectilinear, one of the clocks
should be decelerated and then accelerated in the opposite direction
to attain the same velocity. In that case the clock whose direction
of motion was reversed will find itself at the same point as the
clock against which the initial comparison was made. All this can
be seen very well in the Minkowski diagram where the world lines of
two clocks, I and II, are depicted (Fig. 8.12). The "clock paradox"
is very convenient to analyse by means of K calculus (§ 3.7).





We shall make use of the space-time chart of Fig. 8.12. It
illustrates the world lines of three clocks: clock I located at the
origin of K (the line OD), clock II resting at the origin of K′
(the line OT) and, finally, clock III resting in K″ (the line TD).
Let us find the measured time intervals directly. At the moment
t = t′ = 0, when the origins O and O′ coincide, the initial
exchange of light signals occurs which takes no time since the
clocks from K and K′ are positioned at one point. Clocks II and III
get together at the world point T; at this moment a light signal is
sent from the point T to clock I. Let clock II register the
proper-time interval $\Delta\tau_2$ between its encounters with
clocks I and III. Then, as we know, clock I must register the time
interval $k\,\Delta\tau_2$ between the encounter of clocks I and II
and the arrival of the light signal from T. But the signal from T
was sent at the moment when clocks II and III were against each
other, and therefore if clock III registers the proper-time
interval $\Delta\tau_3$ between the encounter of clocks II and III
and its arrival at the point D, it is possible to find the time
interval between the reception of the light signal by clock I at
the point E and the encounter of clocks I and III at the point D.
We saw in § 3.7 that if the sign of the relative velocity of the
two reference frames changes to the opposite, the coefficient k
changes to 1/k. Consequently, the time interval shown in Fig. 8.12
by the section ED is equal to $\Delta\tau_3/k$. From the symmetry
of the imaginary experiment discussed here it is clear that
$\Delta\tau_2 = \Delta\tau_3$. Designating the magnitude of this
time interval by Δτ, we conclude that the time interval registered
by clock I between its encounters with clocks II and III is equal to

$$k\,\Delta\tau + \frac{\Delta\tau}{k} = \frac{\Delta\tau}{k}\,(k^2 + 1). \tag{8.22}$$

The total time registered by the two observers (clocks II and III)
is equal to 2Δτ. This quantity is always less than the one given by
Eq. (8.22) since from the inequality $(k - 1)^2 > 0$ it immediately
follows that

$$k^2 + 1 > 2k.$$

Fig. 8.12. The world lines of two clocks I and II. The world line
OD corresponds to clock I resting in K. Clock II first moves
uniformly from clock I (the line OT), then, having altered the
velocity to the equal and oppositely directed velocity at the point
T, approaches clock I again. At the point D they get together and
their readings can be compared again (the first intercomparison was
made at the point O). This comparison of clocks amounts to what is
called the clock paradox. The inset illustrates the world line of
one clock getting back to the point D.
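
Eq. (8.22) can be evaluated for a concrete relative velocity. The sketch below assumes that the coefficient k of § 3.7 has the standard Bondi form k = √((1 + B)/(1 − B)); this form is an assumption here, since § 3.7 is not reproduced in this section. The function names are hypothetical.

```python
import math


def k_factor(B):
    """Bondi coefficient for the relative velocity B = V/c (assumed form)."""
    if not 0.0 <= B < 1.0:
        raise ValueError("B must lie in [0, 1)")
    return math.sqrt((1.0 + B) / (1.0 - B))


def clock_I_interval(B, d_tau):
    """Time registered by clock I between the events O and D, Eq. (8.22)."""
    k = k_factor(B)
    return k * d_tau + d_tau / k


if __name__ == "__main__":
    B, d_tau = 0.6, 1.0
    total_I = clock_I_interval(B, d_tau)
    total_II_III = 2.0 * d_tau
    Gamma = 1.0 / math.sqrt(1.0 - B * B)
    print(f"k = {k_factor(B):.4f}")
    print(f"clock I: {total_I:.4f}, clocks II and III together: {total_II_III:.4f}")
    print(f"ratio = {total_I / total_II_III:.4f}, Gamma = {Gamma:.4f}")
```

With this form of k the sum k + 1/k equals 2Γ, so the result agrees with Eq. (8.13) applied to each leg of the journey.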


The obvious advantage of this approach lies in the fact that all 
time measurements are performed by means of clocks resting in 
inertial frames of reference. Thus, the time interval between events 
appears shorter when measured by two inertial observers, as com- 
pared to measurements made by one observer. Note that here, un- 
like the case when the time interval measured by one clock is com- 
pared to the time interval between the same events measured by 
two clocks of another IFR, one compares the time intervals mea- 
sured by clocks of three IFRs. 

So, the use of two clocks, II and III, has led us to the conclusion
about different measurements of time intervals. It is sometimes
suggested that the same clock be used in the reference frames K′
and K″: at the point T clock II is simply transferred to the frame
K″ to make it possible to measure the time interval in question by
the same clock. This suggestion is worth dwelling upon. Although we
measure the time interval between the events O and D by means of
two clocks, I and II, these clocks are far from being equivalent in
the considered case. When clock II is transferred from K′ to K″, it
undergoes an acceleration and finds itself in a non-inertial frame.
Its world line transforms into a curve (see the inset in
Fig. 8.12). But inertial motion is by no means equivalent to
non-inertial motion. It is quite possible that the clock which
moved by inertia registers a longer time interval as compared to
the clock which participated in the non-inertial motion. There is
no inconsistency here; the same conclusion is obtained from the
Einstein theory of gravitation.

We have already mentioned (see § 3.3) that any acceleration,
in principle, affects the clock rate. Basically, the clock's rate
is "correct" in inertial frames of reference. Let the world line of
a particle be curved; this means that the particle undergoes an
acceleration. At any moment of the accelerated motion we can find
an inertial observer moving along a tangent to the actual motion
path and having the instantaneous velocity of that motion. The
clock moving with an acceleration has a "correct" rate provided it
coincides with the rate of an identically constructed clock moving
together with the inertial observer in the manner indicated.

At what point of the world line does the difference between the
readings of "inertial" and "non-inertial" clocks come about? From
the principle of relativity it follows that the rates of
identically constructed clocks are the same in all IFRs. Whence it
is clear that the difference in the readings of two clocks brought
to the same point in space is caused by the clock acceleration,
i.e. by the curved portion of the world line. One often hears the
objection that the curved portion of the world line can be made as
small as needed, i.e. an acceleration can be imparted for a very
short time, while the accumulated difference in time readings can
be very large. We should bear in mind, however, that an
acceleration imparted during a short time interval involves immense
forces, and the reversal of a relativistic velocity direction is
associated with a considerable acceleration. Moreover, the
difference in length between a curved world line and a straight
line connecting the same points is determined not by the length of
the curved portion, but by the overall curvature of the world line.
This fact is very well illustrated in Fig. 8.13: although path II
from town A to town B goes "along a straight line for almost all of
the time", it is, no doubt, longer than path I going from A to B
along a straight line connecting them. If an acceleration does not
affect the clock's rate, the length of the world line of a particle
determines the proper-time interval.

Fig. 8.13. Path I between towns A and B is shorter than path II
although path II differs from a straight line only at a short
section. The difference in length is caused not so much by a
curvilinear section as by the fact that path II is not a straight
line as a whole.
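
The remark that the accumulated difference is governed by the overall shape of the world line, and not by the length of its curved portion, can be illustrated by a direct numerical integration of the proper time. The sketch below is an illustration under stated assumptions: units with c = 1, a hypothetical velocity profile in which the velocity is reversed linearly during a chosen turnaround time, and a clock whose rate is not affected by acceleration itself.

```python
import math


def proper_time(v_max, total_time, turn_time, steps=200_000):
    """Proper time (c = 1) along a there-and-back world line.

    The clock moves at +v_max, reverses its velocity linearly during
    turn_time (centred on total_time / 2) and returns at -v_max.
    The integral of sqrt(1 - v^2) dt is taken by the midpoint rule.
    """
    if not 0.0 < turn_time <= total_time:
        raise ValueError("turn_time must lie in (0, total_time]")
    t_start = 0.5 * (total_time - turn_time)
    dt = total_time / steps
    tau = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        if t < t_start:
            v = v_max
        elif t > t_start + turn_time:
            v = -v_max
        else:
            v = v_max * (1.0 - 2.0 * (t - t_start) / turn_time)
        tau += math.sqrt(1.0 - v * v) * dt
    return tau


if __name__ == "__main__":
    v_max, total = 0.8, 10.0
    print(f"clock at rest           : {total:.4f}")
    for turn in (1.0, 0.1, 0.01):
        print(f"turnaround lasting {turn:5.2f}: {proper_time(v_max, total, turn):.4f}")
```

Shortening the turnaround hardly changes the travelling clock's reading, which stays close to the value for two straight legs, while the clock at rest registers the full interval.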

Until now we discussed time intervals registered by one or two
clocks. Going back to the initial problem, one may ask what clocks
I and III show at the moment of their encounter at the point D. We
remember that the sets of clocks in K, K′ and K″ are so
synchronized that at the moment when the origins O, O′ and O″ of
the frames coincide the three clocks from the three frames show the
reading t = t′ = t″ = 0. Now pay attention to the diagram in
Fig. 8.14. Here the world lines of clocks I, II and III are
supplemented by the simultaneity lines of the frames K′ and K″. The
transition from the frame K′ to the frame K″, that is to another
set of synchronized clocks, results in a jump of the simultaneity
line from AT to TB (see the diagram). This transition explains a
substantial difference in readings of clocks I and III. The
substitution of two living organisms for two identical clocks leads
us to the so-called twin paradox. The transition to living
organisms, however, involves a series of complications, so that we
refer the reader elsewhere [31].

Fig. 8.14. The transition from the frame K′ to the frame K″ implies
a change in the simultaneity line. From the line AT (simultaneity
line of the frame K′) we pass over to the line TB (simultaneity
line of the frame K″).

§ 8.5. The “equivalence” of mass and energy. The zero rest 
mass. In this section we shall go back to the problems that have 
already been discussed; the main reason for this repetition is not 
the fact that new paradoxes will be disclosed, but that the oppor- 
tunity arises to analyse jointly some results that were earlier pre- 
sented separately. A few useful examples will also be given. 

We know from § 5.6 that any physical system possessing the
energy $\mathcal{E}_0$ in a proper reference frame ($\mathbf{P}^{(0)} = 0$) has a rest mass
$M_0 = \mathcal{E}_0/c^2$. In this relation $\mathcal{E}_0$ denotes all the energy contained
in the system. We shall illustrate this by two examples.

1. Consider a closed system consisting of n non-interacting
material points involved in elastic collisions (in classical physics this
model corresponds to an ideal gas). Denote the rest masses of the
points by $m_0^{(1)}, m_0^{(2)}, \ldots, m_0^{(n)}$ and their velocities
in the proper frame $K^0$ by $\mathbf{v}_0^{(1)}, \mathbf{v}_0^{(2)}, \ldots, \mathbf{v}_0^{(n)}$.
Let us pass over now to another inertial frame of reference K whose
relative velocity V is directed along the x axis. In K the
three-dimensional velocity components of the points will be found by
means of the following formulae:

$$v_x^{(k)} = \frac{v_{x0}^{(k)} + V}{1 + v_{x0}^{(k)}V/c^2}, \quad v_y^{(k)} = \frac{v_{y0}^{(k)}\sqrt{1 - B^2}}{1 + v_{x0}^{(k)}V/c^2}, \quad v_z^{(k)} = \frac{v_{z0}^{(k)}\sqrt{1 - B^2}}{1 + v_{x0}^{(k)}V/c^2}. \tag{8.23}$$

Let us also make use of Eq. (3.17):

$$\sqrt{1 - \frac{(v^{(k)})^2}{c^2}} = \frac{\sqrt{1 - (v_0^{(k)})^2/c^2}\,\sqrt{1 - B^2}}{1 + v_{x0}^{(k)}V/c^2}, \tag{8.24}$$

which we shall rewrite using the designations adopted in this
book ($B = V/c$, $\Gamma = 1/\sqrt{1 - B^2}$):

$$\gamma^{(k)} = \Gamma\gamma_0^{(k)}\left(1 + \frac{v_{x0}^{(k)}V}{c^2}\right).$$

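The resulting momentum of the system is defined as the sum of
the momenta of individual particles:

$$\mathbf{P} = \sum_k m_0^{(k)}\gamma^{(k)}\mathbf{v}^{(k)} \qquad \Bigl(\mathbf{P}^{(0)} = \sum_k m_0^{(k)}\gamma_0^{(k)}\mathbf{v}_0^{(k)} = 0\Bigr).$$

Therefore,

$$P_x = \sum_k m_0^{(k)}\gamma^{(k)}v_x^{(k)} = \sum_k m_0^{(k)}\Gamma\gamma_0^{(k)}\bigl(v_{x0}^{(k)} + V\bigr) = \Gamma\sum_k m_0^{(k)}\gamma_0^{(k)}v_{x0}^{(k)} + \Gamma V\sum_k m_0^{(k)}\gamma_0^{(k)} = \Gamma\frac{\mathcal{E}_0}{c^2}V.$$

It can be easily found that

$$P_y = P_y^{(0)} = 0 \quad \text{and} \quad P_z = P_z^{(0)} = 0.$$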


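Accordingly, the energy of the system is

$$\mathcal{E} = \sum_k m_0^{(k)}c^2\gamma^{(k)} = \sum_k m_0^{(k)}c^2\Gamma\gamma_0^{(k)}\left(1 + \frac{v_{x0}^{(k)}V}{c^2}\right) = \Gamma\sum_k m_0^{(k)}c^2\gamma_0^{(k)} = \Gamma\mathcal{E}_0,$$

since

$$\sum_k m_0^{(k)}\gamma_0^{(k)}\Gamma v_{x0}^{(k)}V = \Gamma V\sum_k m_0^{(k)}\gamma_0^{(k)}v_{x0}^{(k)} = 0.$$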
Consequently, in the case of a closed system

$$\mathbf{P} = \Gamma\frac{\mathcal{E}_0}{c^2}\mathbf{V}, \qquad M_0 = \frac{\mathcal{E}_0}{c^2}, \tag{8.25}$$

where V is the velocity of the centre of inertia. This means
that the rest mass of the system $M_0$ is equal to $\mathcal{E}_0/c^2$. In terms of
the kinetic theory of matter the rest energy $\mathcal{E}_0$ of the system must
include the thermal energy as well. And indeed, we have just
found that the rest mass of the system comprises not only the rest
masses of individual particles, but also their total kinetic energy,
that is, the thermal energy when treated in macroscopic terms.

2. Consider an inelastic collision of two bodies. A system of two
bodies can be regarded as closed, and therefore the conservation law
for the 4-momentum can be applied to this process. Denote the rest
mass of the body formed after the collision by $M_0$ and the rest masses
of the colliding bodies by $m_0^{(1)}$ and $m_0^{(2)}$. The energy-momentum
conservation law will be written in the four-dimensional form as
follows:

$$m_0^{(1)}u_i^{(1)} + m_0^{(2)}u_i^{(2)} = M_0u_i, \tag{8.26}$$

where $u_i$ is the 4-velocity of the single body formed after the
collision. The first three equations of (8.26), for i = 1, 2, 3, permit the
three velocity components of the single body to be found. As to the
fourth equation (i = 4), it is written as

$$m_0^{(1)}\gamma^{(1)} + m_0^{(2)}\gamma^{(2)} = M_0\gamma.$$

In the reference frame where the newly formed body is at rest ($\gamma = 1$)

$$M_0 = m_0^{(1)}\gamma^{(1)} + m_0^{(2)}\gamma^{(2)}.$$

This equality can also be written as follows:

$$M_0 = \frac{1}{c^2}\left[m_0^{(1)}c^2\bigl(\gamma^{(1)} - 1\bigr) + m_0^{(2)}c^2\bigl(\gamma^{(2)} - 1\bigr)\right] + m_0^{(1)} + m_0^{(2)}. \tag{8.27}$$


It is seen from Eq. (8.27) that the rest mass $M_0$ of the newly
formed system contains the sum of the rest masses of the initial
particles $m_0^{(1)} + m_0^{(2)}$ and a certain additional mass associated with
the fact that the relativistic kinetic energy of the two particles (the
expression in brackets) has been transformed into some other
kinds of energy (e.g. heat). Thus, in relativistic mechanics the
energy conservation law includes all kinds of energy (and not only
those usually taken into account in mechanics).
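
Both examples can be illustrated with a few lines of Python. This is a minimal sketch under assumptions of our own: units with c = 1, two particles of unit rest mass moving toward each other at 0.6c in the proper frame, and a relative frame velocity V = 0.5c. The function names (`four_momentum`, `total`, `rest_mass`, `to_frame_K`) are hypothetical helpers, not notation from the book.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1

def gamma(speed):
    """Lorentz factor for a given speed."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_momentum(m0, v):
    """Return (E, px, py, pz) of a particle with rest mass m0 and 3-velocity v."""
    g = gamma(math.sqrt(sum(vi * vi for vi in v)))
    return (m0 * g * C**2,) + tuple(m0 * g * vi for vi in v)

def total(momenta):
    """Componentwise sum of several 4-momenta."""
    return tuple(sum(parts) for parts in zip(*momenta))

def rest_mass(p):
    """Rest mass of a system from M^2 c^2 = (E/c)^2 - P^2."""
    energy, px, py, pz = p
    return math.sqrt((energy / C) ** 2 - (px**2 + py**2 + pz**2)) / C

def to_frame_K(p, V):
    """4-momentum in the frame K, relative to which the proper frame moves with velocity V along x."""
    energy, px, py, pz = p
    G = gamma(V)
    return (G * (energy + V * px), G * (px + V * energy / C**2), py, pz)

# Proper frame: the total 3-momentum vanishes.
particles = [(1.0, (0.6, 0.0, 0.0)), (1.0, (-0.6, 0.0, 0.0))]
p0 = total(four_momentum(m, v) for m, v in particles)
M0 = rest_mass(p0)        # exceeds 1 + 1 by the kinetic energy divided by c^2
pK = to_frame_K(p0, 0.5)  # the same system described in the frame K

print(M0, rest_mass(pK))             # the rest mass is the same in both frames
print(pK[1], gamma(0.5) * M0 * 0.5)  # P_x compared with Gamma * (E0 / c^2) * V, Eq. (8.25)
```

With these numbers each particle has γ = 1.25, so the system (or, in the second example, the body formed in a completely inelastic collision) has a rest mass of about 2.5 rather than 2; the excess of 0.5 is the kinetic energy of the two particles divided by c².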

Finally, it should be stressed once more that the relations ob- 
tained indicate the proportionality of rest mass and rest energy; 
it is far more important to remember that this fact is valid only
in a proper reference frame. Generally speaking, the rest energy 
and rest mass possess different properties under the Lorentz trans- 
formation when treated in four-dimensional terms (§ 5.7). Thus, 
speaking of a conversion of “mass” into energy is meaningless 
although sometimes one hears such a statement. 

Now let us go back to a zero rest mass. Of course, from the 
classical standpoint a zero rest mass seems rather strange. We 
have seen (§ 7.6) that a zero rest mass should be attributed to the 
particles moving at the velocity c. In accordance with contemporary
ideas the particles of this kind are light quanta (photons)
and neutrinos. As we know, the velocity c holds a privileged
position in the STR since in all experimentally feasible IFR this 
velocity retains its value. We could finish here, but we would like 
to make some more remarks. 

Obviously, there is no contradiction in regarding the matter (in 
its philosophical meaning) possessing a finite rest mass to be 
equivalent to the matter possessing a zero rest mass. We shall see 
that the latter case is realized comparatively rarely in nature, but 
basically it is feasible. These two forms of matter just mentioned 
can pass into one another. Now we shall dwell on one example of 
such a conversion. This is the formation of electron-positron pairs 
by gamma quanta (high energy photons) and the reverse reaction 
of collision between an electron and a positron (this reaction is 
known under the somewhat obsolete name of “annihilation” of
particles). This reaction brings to an end the existence of particles
possessing a finite rest mass (an electron and a positron), leading 
to the appearance of two photons. What is essential, this reaction 
satisfies the momentum and energy conservation laws. Just as 
photons possessing a zero rest mass, so an electron and a positron 
possessing a finite rest mass are characterized by definite momenta 
and energies. The corresponding quantities resulting from this 
reaction remain the same; a photon as an objective reality is 
defined by its momentum and energy. The photon’s rest mass, which
is equal to zero, characterizes a photon no less than the finite
rest mass characterizes an electron or a positron.

If the collision of an electron and a positron is considered in
the frame fixed to the centre of inertia, that is, the frame in which
the particles move toward each other at equal, but oppositely
directed, velocities $v_1$ and $v_2$, the energy conservation law takes the
form

$$\frac{m_0c^2}{\sqrt{1 - v_1^2/c^2}} + \frac{m_0c^2}{\sqrt{1 - v_2^2/c^2}} = 2h\nu.$$

This equation shows that the total energy of the electron and the
positron is equal to the energy of the two photons formed. If we take
into account that in the frame of the centre of inertia $v_1 = v_2 = v$,
the observed frequency of these photons will be equal to

$$\nu = \frac{m_0c^2}{h\sqrt{1 - v^2/c^2}}. \tag{8.29}$$
From the momentum conservation law it follows that the energies
of the photons formed are equal: the momenta of the photons
must be equal in magnitude (and oppositely directed), and the
photon’s energy is proportional to its momentum.

When an electron and a positron move at non-relativistic
velocities, the frequency of the photons resulting from their
annihilation is equal to $\nu = m_0c^2/h$, which is
in good agreement with experimental data.
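
To give a feeling for the magnitudes involved, Eq. (8.29) can be evaluated numerically. The sketch below is an illustration only; the values of the constants are standard reference values inserted by us (they are not quoted in the text), and the function name `photon_frequency` is hypothetical.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 299_792_458.0       # speed of light, m/s
M_E = 9.1093837015e-31  # electron rest mass, kg
EV = 1.602176634e-19    # joules per electronvolt

def photon_frequency(v):
    """Frequency of each annihilation photon, Eq. (8.29), for particle speed v
    in the frame of the centre of inertia."""
    return M_E * C**2 / (H * math.sqrt(1.0 - (v / C) ** 2))

nu0 = photon_frequency(0.0)  # non-relativistic limit, nu = m0 c^2 / h
print(f"{nu0:.3e} Hz, photon energy {H * nu0 / EV / 1e6:.3f} MeV")
```

In the non-relativistic limit the frequency comes out at roughly 1.24 × 10^20 Hz, which corresponds to a photon energy of about 0.511 MeV.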

The example presented here is far from being unique. We may
also mention the decay of a neutral π⁰ meson (possessing a
rest mass equal to about 264 rest masses of an electron): π⁰ → 2γ.

Let us consider now n photons of the same frequency moving in
various directions. The energy of this system of photons is equal
to the sum of the energies of individual photons: $\mathcal{E} = \sum_i \varepsilon_i = nh\nu$;
the momentum of the system of photons P is equal to the sum of
the momenta of the photons:

$$\mathbf{P} = \sum_i \mathbf{p}_i = \frac{h\nu}{c}(\mathbf{s}_1 + \mathbf{s}_2 + \ldots + \mathbf{s}_n),$$

where $\mathbf{s}_i$ is a unit vector oriented along the propagation
direction of the ith photon. In accordance with the definition, the rest
mass M of this set of photons can be found from the expression

$$M^2c^2 = \frac{\mathcal{E}^2}{c^2} - P^2 = \left(\frac{nh\nu}{c}\right)^2 - \left(\frac{h\nu}{c}\right)^2(\mathbf{s}_1 + \mathbf{s}_2 + \ldots + \mathbf{s}_n)^2. \tag{8.30}$$

The right-hand side of Eq. (8.30) turns into zero only when all
the photons propagate in one direction. This result was obtained
in § 7.3: a limited train of plane waves has a zero rest mass.
However, two photons whose propagation directions form a certain
angle θ possess a finite rest mass. Indeed, from the general
formula (8.30) we get

$$M^2c^2 = \left(\frac{2h\nu}{c}\right)^2 - \left(\frac{h\nu}{c}\right)^2(2 + 2\cos\theta) = \left(\frac{2h\nu}{c}\right)^2\left(1 - \cos^2\frac{\theta}{2}\right). \tag{8.31}$$


Thus, a cloud of electromagnetic radiation consisting of photons,
each of which has a zero rest mass, possesses a positive
rest mass (unless all the photons travel in one direction) and,
accordingly, induces a gravitational field and experiences the
force of a gravitational field.
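
Formulae (8.30) and (8.31) are easy to check numerically. The following Python sketch is only an illustration under assumptions of our own: units in which c = 1 and the energy hν of each photon equals 1; the function names `photon_system_mass` and `two_photons` are hypothetical.

```python
import math

def photon_system_mass(energy, directions, c=1.0):
    """Rest mass of a set of photons of equal energy, Eq. (8.30).

    `directions` is a list of unit 3-vectors s_i along the propagation directions.
    """
    n = len(directions)
    total_energy = n * energy
    total_momentum = [energy / c * sum(s[i] for s in directions) for i in range(3)]
    m2c2 = (total_energy / c) ** 2 - sum(p * p for p in total_momentum)
    return math.sqrt(max(m2c2, 0.0)) / c  # max() guards against rounding below zero

def two_photons(theta):
    """Unit vectors of two photons whose directions form the angle theta."""
    return [(1.0, 0.0, 0.0), (math.cos(theta), math.sin(theta), 0.0)]

for theta in (0.0, math.pi / 2, math.pi):
    numeric = photon_system_mass(1.0, two_photons(theta))
    closed_form = 2.0 * math.sin(theta / 2)  # from Eq. (8.31): M c = (2 h nu / c) sin(theta / 2)
    print(theta, numeric, closed_form)
```

For parallel photons the mass vanishes, while for photons moving in opposite directions it reaches its largest value, 2hν/c².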

Proceeding from the fact that even two photons possess a finite
rest mass, we could try to avoid discussing the zero mass of a 
photon. But an individual photon can be observed in principle *, 
and, therefore, a zero rest mass needs to be interpreted. 

In order to clear up the cause that has led to the appearance
of a “zero rest mass” it is expedient to use four-dimensional
concepts. Let us consider the 4-momentum of a particle of a finite
rest mass m:

$$\vec{P} = \left(m\gamma\mathbf{v},\ \frac{\mathcal{E}}{c}\right), \qquad \mathcal{E} = mc^2\gamma.$$

* O. Frisch, UFN 90, 379 (1966).

The rest mass is given by the norm of the 4-vector $\vec{P}$:

$$\vec{P}^2 = \frac{\mathcal{E}^2}{c^2} - p^2 = m^2c^2, \tag{8.32}$$

which is an invariant. In the 4-vector of energy-momentum the
energy is represented by the time component whereas the spatial
components are the components of the three-dimensional momentum.
It is relevant to recall that the basic properties of the 4-vector
$\vec{P}$ coincide with those of the 4-vector $\vec{V}$ since $\vec{P} = m\vec{V}$. On the
other hand,

$$\vec{V}^2 = \frac{d\vec{R}^2}{d\tau^2} = \left(\frac{ds}{d\tau}\right)^2. \tag{8.33}$$

Therefore, in the case of the world lines of zero length (ds = 0)
the rest mass of the corresponding particles turns into zero since
$\vec{P}^2 = 0$. Thus, photons move along the lines of zero length.

There is still another question which seems paradoxical at first
glance. How does a photon, whose rest mass is equal to zero,
transfer a finite rest mass from one point to another? An
absorption of a photon proves that this is really the case. For
example, donating its energy to a solid body, a photon warms that
body up and thus increases its rest mass.

Fig. 8.15. A photon transfers mass although its mass is equal to
zero. Before the radiation of a photon the energy of the carriage is
equal to $\mathcal{E}_0$. In a closed system the 4-momentum is conserved, so that
the total 3-momentum of the carriage and the photon is equal to zero
as before and the total energy of the carriage and the photon is
equal to $\mathcal{E}_0$. The mass of the system remains constant although the
mass of the carriage has decreased while the photon’s mass is equal
to zero (mass is not additive!). When the photon is absorbed at the
other end of the carriage, the energy of the carriage becomes equal
to $\mathcal{E}_0$ again, but the energy $h\nu$ has already been transferred from
one end of the carriage to the other and the mass distribution
over the carriage differs from the initial one.


Let us analyse a simple example. At one end of a carriage capable
of frictionless motion a photon is emitted; then the photon
is absorbed at the other end of the carriage. Prior to the emission
of the photon the energy of the stationary carriage is $\mathcal{E}_0 = Mc^2$
(Fig. 8.15). Since the system is closed, its 4-momentum remains
constant and the sum of the 3-momenta of the carriage and the
photon is equal to zero as before. The total energy of the
carriage and the photon is $\mathcal{E}_0$. The mass of the system remains
constant although the mass of the carriage has decreased and the
photon’s mass is equal to zero. There is nothing to be puzzled
about: mass is not an additive quantity! When the photon is
absorbed at the other end of the carriage, the energy of the carriage
is again $\mathcal{E}_0$, but by that time the energy $h\nu$ has been transferred
from one end of the carriage to the other and the mass distribution
over the carriage has become different from the initial one.
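
The bookkeeping of this example can be followed numerically. The sketch below rests on assumptions of our own: units with c = 1, a carriage of mass M = 1 treated as a single body described only by its energy and momentum, and a photon energy hν = 0.1 chosen arbitrarily. The helper `mass` is hypothetical.

```python
import math

C = 1.0  # assumption: units with c = 1

def mass(energy, momentum):
    """Rest mass from M^2 c^2 = (E/c)^2 - P^2 (one-dimensional motion)."""
    return math.sqrt((energy / C) ** 2 - momentum**2) / C

M = 1.0      # mass of the carriage before the emission
E0 = M * C**2
eps = 0.1    # photon energy h*nu, a hypothetical value

# While the photon is in flight the carriage recoils with momentum -eps/c.
carriage_in_flight = mass(E0 - eps, -eps / C)
photon = mass(eps, eps / C)
system = mass((E0 - eps) + eps, -eps / C + eps / C)

print(carriage_in_flight)           # smaller than M
print(photon)                       # zero
print(system)                       # equal to M
print(carriage_in_flight + photon)  # not equal to the mass of the system
```

The mass of the carriage during the flight of the photon is about 0.894, the photon's mass is zero, and yet the mass of the system as a whole stays equal to 1: the masses of the parts do not add up to the mass of the system.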

Finally, we want to point out that the conclusions of the STR
make us define more accurately the concept of a “closed” system.
In mechanics a system is called closed if the constituent bodies do
not interact with “external” bodies. An interaction is described by
means of forces. In chemistry a system is usually called closed
if it does not exchange any matter with the environment (then,
according to non-relativistic ideas, the mass remains constant).
Passing over to thermal processes, we expect a closed system to be
heat-insulated. However, the STR declares any energy transfer to
be associated with a momentum transfer (this covers heat transfer
as well): an energy transfer leads to a change of a system's
mass. These definitions can be combined into one by regarding as
closed a system in which the energy and momentum (the 4-vector of
energy-momentum) are conserved. The energy and momentum
remain constant in a closed mechanical system. Such a system is
heat-insulated. In accordance with the conventional definition
$M^2c^2 = (\mathcal{E}/c)^2 - P^2$ the mass of the system remains constant. Of
course, the mass conservation law, when applied to a closed
system, does not imply the additivity of masses in the system. This
fact should be allowed for especially in the case of generation of
new particles.


SUPPLEMENT 


1. Who developed the special theory of relativity, and how? * 
(V. L. Ginzburg). The theory of relativity is one of the greatest 
scientific discoveries of all times; moreover, it was made in our 
century. The latter fact is especially significant in that the theory 
is not so much a part of the history of science (or, if you wish, 
not only a part of the history of science), but a physical theory 
with direct and very extensive applications. That is the main reason 
for the heightened interest in the story of how the theory of rela- 
tivity evolved. It required a reappraisal of the fundamental con- 
cepts of space and time, and thereby of the very foundations of 
classical (pre-relativistic) physics. Old concepts die hard and new 
positions are not easily won: the controversies and debates went on 
for decades. Moreover, they involved representatives of other 
sciences in addition to physicists. The theory of relativity has been 
and remains the focus of intense scrutiny. This, of course, is also 
true of its history — as an account of the evolution of ideas as well 
as an issue of priority. 

Thus it is that even today, seventy years after the enunciation 
of the special theory of relativity, people are still asking, Who in 
fact developed it, and how? 

Special relativity is most frequently associated with the name of 
Albert Einstein, with H.A. Lorentz, Henri Poincaré, and a few 
others mentioned as his predecessors. But there are other opinions 
which, for example, name Lorentz, Poincaré and Einstein as the 
joint authors of STR. What view is more justified, and what, in 
fact, is the argument all about? The answer to this question, as 
well as to the one in the heading of this supplement, should be of 
interest to the reader of this book. Below follow some remarks on 
this score. 

Three works are recognized as crucial to the special theory of
relativity. The author of one (1904) was the Dutch Professor
Hendrik Lorentz (1853-1928), one of the leading lights in theoretical
physics, winner of the 1902 Nobel Prize in physics. The author
of the second work (1906, a brief preview of which had been
published in 1905) was the celebrated French mathematician Henri
Poincaré (1854-1912), also famous for his research in physics and
the methodology of science. Finally, the third work (1905) was
written by a virtually unknown clerk of the Swiss Federal Patent
Office, Albert Einstein (1879-1955).

* This is a revised version of V. L. Ginzburg’s article that appeared in its
final form in the 1974 Einstein Collection (Moscow, 1976, in Russian).

It is common knowledge that new works of popular and favourite 
writers and poets immediately attract universal attention, whereas 
novices have to battle against stiff odds. In science this natural
tendency is, if anything, more pronounced. How come that, in the 
case of STR, it was the other way round, and it was Einstein's 
work that gained acclaim, nay, renown? A clear answer to this 
question was given by Wolfgang Pauli in his well-known article 
“Theory of Relativity”, first published in 1921 in the then presti- 
gious Mathematical Encyclopedia. Pauli’s article was subsequently 
reprinted and translated into other languages (the Russian trans- 
lation appeared in 1947). Pauli concludes his account of the history 
of the special theory of relativity with the words: “It was Einstein, 
finally, who in a way completed the basic formulation of this new 
discipline. His paper of 1905 was submitted at almost the same 
time as Poincaré’s article and had been written without previous 
knowledge of Lorentz’s paper of 1904. It includes not only all the 
essential results contained in the other two papers, but shows an
entirely novel, and much more profound, understanding of the 
whole problem” [8]. Another eminent physicist, Max Born, recalls 
his impression after reading Einstein’s paper: “Although I was 
quite familiar with the relativistic idea and the Lorentz transfor- 
mations, Einstein’s reasoning was a revelation to me.” 

It is in this entirely new and profound elucidation of the problem,
making it a revelation, that the success of Einstein’s work is rooted,
which is what made it fundamental to the enunciation of the
special theory of relativity.

A perusal of the history of science primarily focuses on two 
questions. The first is, How? How did ideas appear and evolve, 
how was a discovery prepared and made? The second question is, 
Who? Who made the discovery, voiced the idea, turned it into 
“flesh and blood”, elaborated it and drove it home to the scientific 
community? The question How? would appear to be the basic, 
primary one: it is connected with the very content of science and 
the methods of scientific research. The question Who? may seem 
secondary; indeed, it has no bearing on the essence of the matter, 
if we take, say, physics and not the psychology of scientific creativ- 
ity, the sociology of the academic milieu, or the life of this or 
that person. Actually, it is difficult, if not impossible, to draw the
line between How? and Who? Science is advanced by people, and
if the end product (the totality of certain assertions, equations,
relationships, etc.) is depersonalized or, more precisely, almost
depersonalized, the initial process of the discovery or development
of the equations and relationships reflects the characteristic, most
typical traits of the discoverer. Thus, as far as the history of 
science is concerned, the questions How? and Who? must be an- 
swered simultaneously. 

We shall preface further remarks on this score with a few words 
about the special theory of relativity (this book, of course, covers 
the topic in much greater detail, but it is useful to sum up the 
situation here). 

One of the fundamental physical concepts is that of inertial 
frames of reference. A frame of reference used to define the coor- 
dinates and time of events is inertial if the law of inertia holds in 
it, namely that an isolated body (not subject to any forces) moves
uniformly in a straight line. To be sure, this definition is not
immune from objections and must be clarified, insofar as it remains
unclear what body can be regarded as isolated. Broadly speaking,
a body can be considered isolated if all other bodies are sufficiently 
far away. An example of a “good” inertial system is a coordinate 
system with the origin at the centre of the Sun and the axes di- 
rected toward the remote stars. The inertia law holds with some- 
what less, but still sufficiently great, precision on the Earth (ne- 
glecting gravity). A reference system rotating relative to an iner- 
tial system is not inertial, the difference between the former and 
the latter being the greater the higher the angular velocity. 

If a given system is inertial, any other system moving uniformly 
in a straight line relative to it is also inertial. The generalization 
of this conclusion over all mechanical phenomena — the assertion 
that all mechanical phenomena occur absolutely identically in all
inertial systems — is just what the classical, or Galilean, principle 
of relativity is all about. More precisely, the definition and appli- 
cation of the principle incorporates the quite definite prerelativistic 
assumption concerning the connection between the coordinates and 
time of events in different inertial systems. Thus, if one such
system K’ (coordinates x’, y’, z’, and time t’) is moving relative to a
given inertial system K (coordinates x, y, z, and time t) with a velocity
V along the positive axes x, x’ (the direction of which we assume
to coincide), then, as assumed before special relativity,

$$x' = x - Vt, \quad y' = y, \quad z' = z, \quad t' = t$$

(the Galilean transformations).

The absolute nature of time, that is, its independence of the motion of
the reference system (whence the equality t’ = t), was, of course,
assumed to hold in all reference systems in general.

In uniform motion a body’s acceleration is, obviously, zero.
Hence, under the Galilean transformations, i.e. in any inertial system,
the acceleration is the same. Therefore, under these transformations,
the law of dynamics, Newton’s second law (mass times accelera- 
tion equals force), remains unchanged as long as the mass, force, 
and acceleration remain the same in the K and K’ systems. The 
latter is assumed (and proved experimentally), and we come to 
the conclusion that the classical principle of relativity holds in 
Newtonian mechanics. Generally speaking, the invariance of the 
equations expressing the fundamental physical laws in the Gali- 
lean transformations is proof of the validity of the classical prin- 
ciple of relativity. 

Up till the end of the 19th century it was held that physics could 
be constructed completely on the basis of the Newtonian equations 
of motion. Thereby the classical principle of relativity was held to 
be invariably valid. However, the development of electrodynamics 
cast doubt on the classical principle of relativity. The equations of 
electrodynamics (Maxwell’s equations) do not retain their form 
in the Galilean transformations, whence their application leads to 
the conclusion that in electrodynamics the relativity principle 
breaks down and, in particular, light and all other electromagnetic 
waves propagate differently in different inertial systems, even in 
vacuum. If the “luminiferous medium” — the ether — introduced 
then is motionless in one inertial system (K), the velocity of light 
in it is $c = 3 \times 10^8$ m/s irrespective of the direction. In other
inertial systems K’ moving with velocity V relative to the ether (along
the x and x’ axes), the velocity of light is, as is obvious from the
Galilean transformations, c’ = c − V along the x and x’ axes and
c’ = c + V in the opposite direction, etc.

But experiments refuted that apparently obvious conclusion; all 
experiments, starting with Michelson’s famous experiment per- 
formed in 1881 and repeated many times since, confirm the validity 
of the relativity principle in electrodynamics as well as in physics 
as a whole. But how, in accordance with the relativity principle, can 
the velocity of light be the same in different reference systems, 
when the Galilean transformations lead to the opposite conclusion? 

It took almost a quarter of a century of agonizing quest to arrive 
at the solution constituting the core and basis of the STR: the 
Galilean transformations had to be wrong. More precisely, as is 
usual in such cases, they were not actually wrong, but approximate. 
The precise equations linking coordinates and time in the frames
K’ and K have the form

$$x' = \frac{x - Vt}{\sqrt{1 - V^2/c^2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - \dfrac{V}{c^2}x}{\sqrt{1 - V^2/c^2}}$$

(the Lorentz transformations). If the relative velocity V of inertial
systems is small compared to the speed of light c, the Lorentz
transformations become the Galilean transformations; the accuracy
of the latter is governed by the parameter $V^2/c^2$. For a satellite in
orbit not far from the Earth $V \approx 8 \times 10^3$ m/s, and $V^2/c^2 \approx 10^{-9}$.
The velocity of the Earth around the Sun is $V \approx 3 \times 10^4$ m/s, and
$V^2/c^2 \approx 10^{-8}$. It is obvious from these examples that in the domain
of the phenomena we encounter in everyday life the Galilean trans- 
formations, and the Newtonian mechanics associated with them, 
are valid to a high degree of accuracy. But in electrodynamics, and 
in studying relativistic particles, travelling at velocities, v, com- 
parable with the speed of light, c, in vacuo, the Lorentz transfor- 
mations are required. One of their corollaries is the equation 


$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2.$$

Remembering that the equation of the front of a spherical light
wave has the form $x^2 + y^2 + z^2 - c^2t^2 = 0$, the above equation
immediately testifies to the validity of the relativity principle in 
the propagation of light: in all inertial systems the speed of light 
is the same and equal to c. 
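
These statements can be verified directly. The Python sketch below is an illustration under assumptions of our own: the rounded value c = 3 × 10^8 m/s used in this supplement, an arbitrarily chosen event, and a relative velocity V = 0.6c; the function names are hypothetical.

```python
import math

C = 3.0e8  # speed of light in m/s, the rounded value quoted in the text

def galilean(x, t, V):
    """Galilean transformation of x and t (y and z are unchanged)."""
    return x - V * t, t

def lorentz(x, t, V):
    """Lorentz transformation of x and t (y and z are unchanged)."""
    g = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return g * (x - V * t), g * (t - V * x / C**2)

def interval(x, y, z, t):
    """The combination x^2 + y^2 + z^2 - c^2 t^2."""
    return x**2 + y**2 + z**2 - (C * t) ** 2

# An arbitrary event and its image in a frame moving at 0.6c.
x, y, z, t = 4.0e8, 1.0e8, 0.0, 2.0
xp, tp = lorentz(x, t, 0.6 * C)
print(interval(x, y, z, t), interval(xp, y, z, tp))  # equal up to rounding

# A point on a light front, x = ct, stays on the light front x' = ct'.
xl, tl = lorentz(C * 1.0, 1.0, 0.6 * C)
print(xl, C * tl)

# The Galilean transformation does not preserve the interval.
xg, tg = galilean(x, t, 0.6 * C)
print(interval(xg, y, z, tg))

# The parameter V^2/c^2 for the two velocities mentioned in the text.
for name, V in (("satellite", 8.0e3), ("Earth", 3.0e4)):
    print(name, (V / C) ** 2)
```

The last two printed values are about 7 × 10^-10 and 10^-8, in line with the orders of magnitude quoted above.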

The special theory of relativity precisely represents the theoret- 
ical constructions based on the principle of relativity and the 
Lorentz transformations. The principal feature of the STR is the 
new spatio-temporal concepts as reflected in the replacement of the 
Galilean transformations by the Lorentz transformations. The
meaning of the latter, physically speaking, is not restricted to the
simple equations linking the coordinates and time x’, y’, z’, t’ with
x, y, z, t. As always in physics, it is necessary to establish the
meaning of all quantities, state the basis of the methods employed
to measure the coordinates and time, and clarify the properties of
the rulers and clocks used for this. One of the problems is that of
synchronizing the clocks in each of the frames K and K’. The
coordinates and time that appear in the Lorentz transformations are
so defined that events simultaneous in the frame K (time t) are
not simultaneous in the frame K’ (time t’). The rejection of
absolute time is an especially radical conclusion (for which we are
indebted to Einstein). In importance and difficulty it can be com- 
pared with the rejection of the idea that the Earth was stationary 
on which Copernicus based his heliocentric system. 

Now we can directly attack the question: Who developed the 
special theory of relativity, and how? 

The road to the STR lay, as is apparent from what has been 
said, through a fundamental difficulty that had to be overcome: 
the principle of relativity holds experimentally in electrodynamics 
as well as in mechanics, but it is incompatible with the Galilean 
transformations. To be sure, Lorentz and others sought to remove
the contradiction without rejecting the Galilean transformations
by assuming that all bodies moving with respect to the ether
contract. If a ruler whose length at rest relative to the ether is $l_0$ is of
length $l_0\sqrt{1 - (V/c)^2}$ when moving at velocity V, then we can
explain why some experiments do not reveal the motion of bodies 
relative to the ether, and their results do not depend on the velocity 
of the Earth’s motion with respect to the Sun. However, the con- 
traction hypothesis is not adequate for all experiments; new facts 
kept coming to light which agreed with the relativity principle and 
required additional hypotheses to explain them. This was, of course, 
an intolerable situation, and Lorentz stubbornly strove to show 
that many electromagnetic phenomena strictly, i.e. without neglect- 
ing higher order terms, do not depend upon the motion of the sys- 
tem. For this Lorentz had to show that for a body in uniform 
rectilinear motion (relative to the ether) the equations of electro- 
dynamics allow for solutions which in a certain way correspond 
to the solutions for an identical body at rest. Correspondence is
achieved by going over to new variables, x’, y’, z’, and t’, with the
help of the Lorentz transformations, as well as the introduction of 
new (primed) electromagnetic field vectors. The field equations do 
not change as a result of these transformations, and they have the 
same form for the old (unprimed) and new (primed) quantities. 
This property is known as invariance, in the present case in- 
variance of the electromagnetic field equations with respect to the 
Lorentz transformations. 

Today, with special relativity, we know that this is precisely the 
confirmation of the validity of the relativity principle in
electrodynamics, though Lorentz did not consider the time t’ to be the time
in the moving reference frame; he called it local time and assumed
that he was dealing simply with supplementary quantities
introduced by means of a mathematical contrivance. In particular, the
variable t’ could not be called “time” in the same sense as the
variable t. In 1915 Lorentz reiterated the idea. He said that the main
reason for his failure had been that he had always held that only
the variable t could be taken as the true time, while his local time
t’ should have been regarded as no more than a supplementary
mathematical quantity. In Einstein’s theory, on the other hand, t’
plays the same part as t. In 1927, a year before his death, Lorentz
stated this even more definitely, saying that for him there was only
one true time and that he regarded his time transformation as
merely a heuristic working hypothesis. Thus, the theory of
relativity was actually the work of Einstein alone. I may add that,
having reread the works of Lorentz and Poincaré (70 years after
their publication), it was only with difficulty, and already knowing
the result (which, of course, greatly facilitates understanding),
that I could understand why the invariance of the electrodynamic
equations with respect to the Lorentz transformations, proved in
these works, could at the time be regarded as proof of the validity 
of the relativity principle. 

Besides, Lorentz and Poincaré saw the principle simply as an 
assertion of the impossibility of observing uniform motion of a 
body with respect to the ether. It requires no special effort to pro- 
ceed from here to treating all inertial reference frames as com- 
pletely equivalent (that is the contemporary formulation of the 
relativity principle) only if the Lorentz transformations are under- 
stood as corresponding to going over to a moving frame of ref- 
erence. 

As we have seen, it was this that Lorentz definitely did not con- 
sider. Poincaré’s stand is less clear. In his paper of 1906 he simply 
asserts that the equations of electrodynamics “can be subjected 
to a remarkable transformation discovered by Lorentz, the significance
of which is that it explains why no experimental demonstration
of the absolute motion of the universe is possible”.* In my
view, this “explanation” goes no farther than Lorentz’s. In general, 
Poincaré writes: “The results which I have obtained agree with 
those of Lorentz in all the principal points, and I have needed only 
to modify and augment them in certain details. These differences, 
which are of but minor importance, will be shown in later sec- 
tions.” ** On the other hand, some of Poincaré’s remarks in earlier 
works, papers and reports sound almost prophetic, notably regard- 
ing the need to define the concept of simultaneity, the possibility 
of using light signals for this, and his comments on the relativity 
principle. However, he did not elaborate on this, and in his works 
of 1905 and 1906 he followed Lorentz. As emphasized before, they 
strove mainly to show, and showed, under what assumptions the 
uniform motion of bodies relative to the ether remained undetect- 
able. But Einstein in his 1905 work reversed, one could say, the 
whole issue by showing that, having accepted the relativity prin- 
ciple and synchronized the clocks with the help of light (and also 
postulating that the velocity of light does not depend on the motion 
of the source), no additional hypotheses were required: the Lorentz 
transformations follow directly from these assumptions. The con- 
traction of moving rods and retardation of the rhythm of moving 
clocks can also be deduced from them. 

Thus, judging by published materials, Poincaré was apparently 
very close to enunciating the STR, but he failed to make the final 
step. We can only surmise why. Perhaps it was because he was 
primarily a mathematician, and it was therefore especially hard 


* C. W. Kilmister, Special Theory of Relativity, Oxford, Pergamon Press,
1970.
** Ibid.
for him to rise (or descend?) to a clear understanding of such 
physically important aspects of the problem as adequate clarifica- 
tion of the meaning of all introduced quantities and concepts. 
Another similar hypothesis is that Poincaré was prevented by his 
predilection for convention, i.e. that school that emphasized (and 
overrated) the role of conventional elements and definitions in 
physics *. That convention plays a part in the development of 
physical theories is indubitable. Length can be measured in 
metres, feet or some other unusual or way-out units. The same is 
true of time and other quantities, as well as of the definition of 
simultaneity: there is no uniquely preordained definition. But the 
end result, the content of a physical theory (as distinct from forms 
of notation, etc.) is not a matter of convention, it is determined by 
nature, a subject of investigation. Overestimation of the conven- 
tional element in knowledge may prevent the clarification of con- 
cepts. It could, in particular, explain why Poincaré failed to clarify 
the meaning of “true” time, t, and “local” time, t’, which are in
fact equally true but are, if you like, “local” times for the frames K
and K’, respectively. 





* As far as I can judge, these comments coincide with the view of Louis de
Broglie, expressed in an address on the occasion of the birth centenary of
Poincaré: “It needed but a little, and Henri Poincaré rather than Albert Einstein
would have been the first to enunciate the theory of relativity in all its
generality, thereby giving French science the honour of the discovery... However,
Poincaré failed to make the decisive step, leaving to Einstein the honour of
perceiving all the corollaries of the relativity principle and, in particular, through
a profound analysis of the measurements of length and time, establishing the real
physical nature of the connection between space and time established by the
principle of relativity. Why did Poincaré fail to pursue his conclusions to the
end? He doubtlessly possessed an extremely critical mind, perhaps because as
a scientist he was first and foremost a pure mathematician. As mentioned before,
Poincaré adopted a somewhat sceptical stance with regard to physical theories,
holding that in general there existed an infinite number of logically equivalent
points of view and pictures of reality from which the scientist, guided solely by
considerations of convenience, chose one. Such nominalism probably prevented
him from conceding that amidst all logically possible theories some were closer
to physical reality, or at least agreed better with the physicist's intuition and
were therefore more useful. That is why young Albert Einstein, who was only 25
at the time and whose knowledge of mathematics was in no way comparable
with the great French scientist's profound knowledge, was able, before Poincaré,
to find the synthesis which at once removed all difficulties, using and justifying
all the attempts of his predecessors. The coup de grace was dealt by a mighty
intellect guided by a profound intuition of the nature of physical reality.

“Yet Einstein’s brilliant success should not let us forget that the problem of
relativity had been earlier and profoundly analysed by the vivid mind of
Poincaré, and that Poincaré made a substantial contribution to the eventual
solution of the problem. Einstein would never have succeeded without Lorentz and
Poincaré” (L. de Broglie, “Henri Poincaré, les théories de la physique.” Le livre
du Centenaire de la Naissance de Henri Poincaré 1854-1954, Paris, 1955).

I feel we must respect the point of view of de Broglie, whose attitude toward 
the memory of Poincaré was that of profound respect and maximum good-will. 
I must stress, however, that such hypothetical reasoning, in this 
case as applied to Poincaré, is on the whole unjustified. There can 
be no doubt that Poincaré took an active part in the development 
of special relativity, and his contribution is indubitable. It is no 
more legitimate to ask why he failed to do Einstein’s work than it 
is to ask the same question concerning other physicists of the time: 
great works are great by virtue of the very fact that they are very 
difficult. 

In addition to what has been said of the part played by Ein- 
stein’s work, here is what he himself had to say, in a letter written 
two months before he died: “Recalling the history of the elabora- 
tion of the special theory of relativity, it can be stated conclusively 
that by 1905 its discovery had been prepared. Lorentz already 
knew that the transformation later named after him was of key 
importance in analysing Maxwell's equations, and Poincaré elab- 
orated that idea. As for me, I knew only of Lorentz’s important 
work of 1895, but not Lorentz’s later publication or the subsequent
investigations by Poincaré. In this sense my work of 1905 was
independent. The new element in it was the idea that the meaning
of the Lorentz transformations went beyond the framework of Max- 
well’s equations and involved the essence of space and time. Also 
new was the conclusion that the ‘Lorentz invariance’ was a general 
condition for all physical theories. That was especially important 
to me because I had already realized that Maxwell’s theory did 
not describe the microstructure of radiation, and therefore did not 
always hold.” * 

So, the reader wishing to receive a simple answer may ask, 
after all is said and done, who developed the special theory of rel- 
ativity? As in most such cases, special relativity is not a dis- 
covery or result attributable solely to one person. However, most 
physicists (myself included) unequivocally credit Einstein with 
the principal role in elaborating it, for it was his work that con- 
tained an “entirely novel, and much more profound, understanding 
of the whole problem” [8]; it was “the last and decisive element 
in the foundation laid by Lorentz, Poincaré and others, on which 
the edifice could be built” (M. Born, Naturwiss Rundschau, 1956). 
First among those “others” is Larmor, who derived the Lorentz 
transformations back in 1900 (Voigt employed transformations very 
similar in form even earlier, as far back as 1887). 

There are other assessments of the part played by Einstein,
Lorentz and Poincaré in elaborating the STR. And whereas extremist
views in effect rejecting Einstein’s contribution cannot be treated 
seriously, more moderate statements such as “special relativity 


* Quoted according to Carl Seelig’s book, Albert Einstein (Russian edition,
Moscow, 1966), the best biography of Einstein I have ever read. 
was developed by Lorentz, Poincaré and Einstein” are, in the final
analysis, their authors’ affair: such things cannot be decreed, and 
no one has invented an instrument for gauging scientific merits 
with pharmaceutical precision. 

To avoid any misunderstanding, one more comment regarding 
the commonly used formula “Einstein’s relativity theory” is ap- 
propriate. It is a natural and legitimate formula, all the more so 
as it is by no means the same as saying “Einstein’s special theory 
of relativity”. For when we speak of relativity theory in general 
we mean both the special and the general theory of relativity. The 
general theory of relativity elaborates upon and advances special 
relativity and is generally regarded as an unsurpassed pinnacle 
of theoretical physics *. Max Born, for example, stated in 1955: 
“I have held and continue to hold that this is the greatest dis- 
covery of the human mind involving nature, which most remark- 
ably combines philosophical depth, intuition, physics and mathe- 
matical art. I admire it as I would a work of art.” Noteworthy is 
Einstein’s own remark in a letter to A. Sommerfeld written in 1912, 
when he was working on general relativity: “In comparison with 
this problem the initial relativity theory (i.e. special relativity. — 
V. G.) was child’s play.” From another letter of Einstein’s we know 
that “the period from the origination of the idea of the special 
theory of relativity to the completion of the paper which set it forth 
was five or six weeks”. It took Einstein eight or nine years (from 
1906 or 1907 to 1915-1916) to elaborate the general theory of rel- 
ativity, after which he continued to work on it until his death on 
April 18, 1955. To this should be added that the general theory of 
relativity is, more than any other theory in the history of science, 
the creation of one person, Albert Einstein. Finally, relativity theory 
emerged from the confines of the scientific community and reached 
the general public only in 1919, when the deflection of light pass- 
ing close to the Sun predicted by general relativity was actually 
observed. Hence, relativity theory as a whole can be legitimately 
associated with Einstein, and him alone. 

Finally, a few words about who was the first. In 1952, Max Born 
wrote to Einstein from Edinburgh: “The elderly mathematician 
Whittaker, with whom I am friendly, and who resides here as 
honorary professor, has prepared a new edition of his old book 
A History of the Theories of Ether and Electricity, the second vol- 
ume of which has already appeared. It includes, among other 
things, a history of the theory of relativity, with the peculiarity 


* As space does not permit me to go in more detail into the place of general 
relativity in the development of physics, I could refer the inquisitive reader to 
my article “The Heliocentric System and the General Theory of Relativity (from 
Copernicus to Einstein)”, which appeared in the Einstein Collection 1973, in 
Russian, Moscow, 1974. 
that its discovery is ascribed to Poincaré and Lorentz, whereas 
your work is mentioned as secondary. Although the book comes 
from Edinburgh, I am not really afraid that you may imagine that 
I am behind it. In fact, for three years I have been doing all I can 
to dissuade Whittaker from his intention, which he cherished for 
a long time and which he liked to advertise. I reread the old orig- 
inal papers, including some of Poincaré’s incidental works, and 
provided Whittaker with English translations of the German 
works... . But all was in vain. He insisted that everything of sub- 
stance could be found in Poincaré’s work and that the physical 
interpretation was obvious to Lorentz. But I know how sceptical 
Lorentz really was, and how long it took for him to become a 
‘relativist’. I explained all this to Whittaker, but unsuccessfully. 
This angers me, because he enjoys great prestige in the English- 
speaking countries, and many will believe him. It is especially un- 
pleasant to me that he makes all kinds of references to partial 
communications regarding quantum mechanics in such a way as 
to especially praise my role in it. So that many (if not you your- 
self) may imagine that I am in some unsavory way involved in 
this thing.” * 

Einstein’s answer was: “Dear Born, don’t give any thought to 
your friend’s book. Everyone behaves as seems to him right, or, 
expressed in deterministic language, as he has to. If he convinces 
others, that is their problem. At any rate, I found satisfaction in 
my efforts, and don’t think it is sensible business to defend my few 
results as ‘property’, like an old miser who has laboriously gath- 
ered a few coins for himself. I don’t think ill of him, to say nothing, 
of course, of you. And I don’t have to read the thing.” ** 

The answer is very typical of Einstein, and it can clarify much 
for those who are unfamiliar with his life. Actually, it explains the 
main thing: the “secret” of his exceptional popularity in the mod- 
ern world. The fact that he was the greatest of the great physicists 
of our, and not only our, age, is fundamental, but it is not all. 
Einstein was also a champion of justice, freedom and other human 
rights, he despised the forces of evil and offered an example of 
nobleness and lofty human dignity. It is simply impossible to 
imagine Einstein engaging in arguments about, still less bickering 
over, priority. The same is true of Lorentz and Poincaré. Lorentz, 
who contributed so much to the development of special relativity, 
gave the credit of its enunciation “solely to Einstein”, and noted 
Poincaré’s contribution. The latter spoke highly of Lorentz’s role. 
Einstein stressed the contributions of Lorentz and Poincaré. It 


* Albert Einstein, Hedwig und Max Born, Briefwechsel, 1916-1955 (letter
dated 26 September, 1952).
** Ibid. (letter of 12 October, 1952).
could be suggested that Poincaré did not consider Einstein’s con- 
tribution to be so great and perhaps even felt that he had “done 
it all” himself. But actually we can only speculate about what Poin- 
caré felt from what he did not say rather than any complaints he 
ever voiced.* 

So far we have spoken only of the initial works of Lorentz, Poin- 
caré and Einstein. It is to be hoped that this is adequate for a com- 
parison of their relative significance. I should like to conclude by 
emphasizing that, naturally enough, work carried out after 1905 
also contributed to the elaboration of special relativity. We could 
note papers by Einstein himself, as well as by Max Planck and, 
especially, Hermann Minkowski (his four-dimensional interpreta- 
tion of the theory proved extremely fruitful). 

II. The unsuccessful search for a medium for the propagation 
of light. Light phenomena in vacuo play a special part in the spe- 
cial theory of relativity. The speed of light in vacuo is the limiting 
speed with which signals can be transmitted, and it is right to say 
that the history of relativity theory begins with the discovery that 
the speed of light is finite. 

As indicated in § 1.8, the theory of relativity proceeds from the 
consideration that the propagation of light (electromagnetic waves) 
requires no material medium; in other words, light can travel 
through vacuum. The idea entered physics with great difficulty, 
and it is associated with the special theory of relativity. Today it 
is part of the ABC of physics. The abandonment of the notion of 
a “luminiferous medium” under pressure of experimental facts is 
an extremely instructive page in the history of physics which is 
worth dwelling upon. We must, however, begin from afar and 
briefly recall the development of notions regarding the nature of 
light. In the late 17th century two points of view on the nature 
of light appeared, neither of which has lost its significance to this 
day. One, the “corpuscular”, belongs to Newton; the other, the 
“wave”, belongs to Huygens. Newton's initial premise can be rea- 
dily appreciated: the success of his mechanics required a mechan- 
ical interpretation of light. Newton held that light represented the 
motion of special material particles called corpuscles. The basic 
properties of light — propagation in a straight line through a ho- 
mogeneous medium, the laws of reflection and refraction — can be 
easily explained in terms of the corpuscular picture. In a homoge- 
neous medium no forces act on the corpuscle, and it moves by 
inertia, i.e. in a straight line. Reflection occurs according to the 
law of elastic impact (like a billiard ball striking the cushion of 
the table). The angle of incidence equals the angle of reflection, 

* It seems remarkable that there is not a single mention of Einstein's work
on special relativity in any paper of Poincaré’s although he died seven years
after it was enunciated.







which is precisely the law of reflection of light from a stationary 
surface. 

If a corpuscle at the interface of media I and II is subjected to
forces normal to the interface in the direction of the denser medium,
it changes the direction of motion. Indeed, let the velocity
components of the corpuscle in medium I, normal and tangential to the
interface, be $V_n$ and $V_t$. The forces acting at the interface augment
$V_n$, the direction of the velocity changes, and “refraction” occurs (Fig. S.1).





Fig. S.1. (a) The change in the direction of motion of a corpuscle in crossing
the interface of two media I and II. After crossing the interface $V'_n > V_n$, but
$V'_t = V_t$, and the direction of the velocity changes. (b) Refraction of light
crossing the interface of two media.


We know from geometrical optics that when light passes from
a medium with a refractive index $n_1$ into a medium with a refractive
index $n_2$ refraction occurs. The connection between the angle
of incidence $i$ and the angle of refraction $r$ is given by the
Descartes-Snell law:

$$\frac{\sin i}{\sin r} = \frac{n_2}{n_1}.$$

Qualitatively, Newton’s reasoning explains the refraction of light.
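
The corpuscular argument can be checked numerically. The sketch below is only an illustration and is not taken from the original text: the function name, the arbitrary units, and the assumption that the interface forces add the same fixed amount to the squared normal velocity at every angle of incidence are all introduced here. Under that assumption the tangential component is conserved, the normal component grows, and the ratio sin i / sin r comes out equal to the ratio of the speeds after and before the interface, the same for every angle.

```python
import math

def refract_corpuscle(speed, incidence_deg, speed_gain_sq):
    """Refraction angle (degrees from the normal) of a corpuscle crossing an interface.

    Assumption (not from the text): the interface forces add the fixed amount
    speed_gain_sq to the squared normal velocity, whatever the angle of incidence.
    """
    i = math.radians(incidence_deg)
    v_t = speed * math.sin(i)                      # tangential component, unchanged
    v_n = speed * math.cos(i)                      # normal component in medium I
    v_n_new = math.sqrt(v_n**2 + speed_gain_sq)    # augmented normal component
    return math.degrees(math.atan2(v_t, v_n_new))

speed = 1.0      # speed in medium I, arbitrary units
gain = 0.7689    # chosen so that the speed in medium II is 1.33
for inc in (10.0, 30.0, 60.0):
    r = refract_corpuscle(speed, inc, gain)
    ratio = math.sin(math.radians(inc)) / math.sin(math.radians(r))
    print(f"i = {inc:5.1f} deg, r = {r:6.3f} deg, sin i / sin r = {ratio:.4f}")
```

In this picture a ratio greater than one requires the corpuscle to move faster in the denser medium.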

Newton also knew that light possessed properties which fitted 
with difficulty into his scheme (recall “Newton’s rings” in optics, 
a typical interference phenomenon), but he devised ingenious ex- 
planations to maintain the corpuscular picture. At about the same 
time, Huygens put forward the idea of the wave nature of light. 
He proceeded from an analogy with sound waves, although he had 
no idea of the nature of light waves. He knew that light could 
travel where sound could not (if you place a bell under a trans- 
parent dome and evacuate the air you can see the hammer strike 
the bell but do not hear the sound). Huygens (and all physicists 
after him until Einstein) could not imagine vibrations propagating 
in the absence of any medium. So it was essential to introduce a 
special medium through which light waves could travel. Huygens 
called it the ether. Thus appeared a concept the fallacy of which 
was revealed only by the theory of relativity. 

Newton rejected the wave theory of propagation of light. He 
based his reasoning on the phenomenon of double refraction in 
crystals. Newton showed that if light propagates as waves then 
double refraction indicates a preferred direction of vibrations in 
the beam. But in Newton’s time only longitudinal waves were 
known, which do not possess this property. So he rejected the 
wave theory, although he conceded that it was plausible. Newton 
also categorically rejected the ether *. 

The corpuscular view of the nature of light was dominant for 
a hundred years after Newton's death. Its success was to a con- 
siderable degree due to his prestige. But a theory is good only as 
long as it does not contradict facts and explains them. The begin- 
ning of the 19th century brought with it the discovery of pheno- 
mena which offered convincing testimony of the wave nature of 
light. Light interference and diffraction were thoroughly investig- 
ated, and rectilinear propagation was satisfactorily explained on 
the basis of the wave theory. The discovery of polarization indi- 
cated that light waves were transverse. 

Thus, the 19th century ushered in the triumph of the wave theory
of light. There seemed no doubt at all that light was a wave process.
But 19th-century physicists could not imagine oscillations in
the absence of some bodies or some medium. There had to be one. 
Its name — the ether — had already been coined by Huygens; it 
remained to establish its physical properties. Nineteenth-century 
physics was dominated by the ideas of mechanics, and it is hardly 
surprising that ether was endowed with the mechanical properties 
of a solid (transverse vibrations can propagate only through elastic 
solids). It was, of course, a queer solid indeed: it could not be 
sensed in motion, was invisible, could not be touched; but neither 
could it be endowed with other properties without coming into 
contradiction with observations. 

But even setting aside the difficulties with the properties of the 
ether, another pertinent problem was that of a frame of reference 
with which the ether could be associated, i.e. the system in which 
it was at rest. Obviously, it would have to be a preferred system 
over all others, at least as far as optical phenomena were con- 
cerned. Here, nature itself seemed to offer a direct answer to the 
question, if the aberration of light was taken into account. The 
phenomenon consists in the following. If a ray of light is observed 


* An idea of the debate between Newton and Huygens can be found in The 
Evolution of Physics, by A. Einstein and L. Infeld, New York, 1942. 
from two reference systems moving relative to one another it will 
be seen at different angles to some direction common to the two 
systems (for instance, the direction of the relative velocity). If we 
observe the beams through a telescope, the visible direction coin- 
cides with the axis of the telescope. But why does the motion of 
the observer (telescope) affect the apparent direction of the in- 
cident light? This can be explained with the help of a simple 
example. Let a bead be falling vertically with a uniform velocity c. 
We want it to pass through a pipe of length $l$ moving horizontally
with a velocity V without hitting the walls. The way to achieve this
is by keeping the bead on the pipe’s axis BB’. For this, when the
bead reaches point B’, the lower end of the pipe B should arrive
there at the same time. Obviously, the pipe should be tilted forward
in the direction of the motion. The angle of inclination $\varphi$ to the
vertical is easily determined. Let the bead travel the vertical distance
$l\cos\varphi$ in time $\tau$. In the same time, the end B of the pipe
must travel the distance $BB' = l\sin\varphi$. But $l\cos\varphi = c\tau$ and
$l\sin\varphi = V\tau$, whence $\tan\varphi = V/c$ (Fig. S.2a).

In the corpuscular theory, the corpuscles play the part of the 
bead. Consequently, the telescope must be tilted forward in the 
direction of the motion. But the same reasoning holds in the wave 
theory of light. To keep the moving pipe from “indenting” the light 
wave front (the velocity of the pipe is V and that of light is c) it 
must be tilted at the same angle $\varphi$, such that $\tan\varphi = V/c$
(Fig. S.2b).
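
The tilt condition can be verified with a short kinematic check. The following sketch is an illustration only: the function name, the coordinate choice (lower end of the pipe starting at x = 0, bead entering the upper end at t = 0) and the exaggerated ratio V/c = 0.1 are assumptions made here, not part of the text. It measures how far the bead strays from the pipe's axis during its transit.

```python
import math

def max_offset_from_axis(V, c, length, tilt, steps=100):
    """Largest horizontal distance between the bead and the pipe axis during transit.

    The pipe's lower end starts at x = 0 and moves with speed V along x; its
    axis is tilted forward by `tilt` radians from the vertical. The bead
    enters the upper end at t = 0 and falls vertically with speed c.
    """
    top_x = length * math.sin(tilt)     # bead's fixed horizontal position
    top_y = length * math.cos(tilt)     # height of the upper end
    transit_time = top_y / c
    worst = 0.0
    for k in range(steps + 1):
        t = transit_time * k / steps
        bead_y = top_y - c * t
        axis_x = V * t + bead_y * math.tan(tilt)   # axis position at the bead's height
        worst = max(worst, abs(axis_x - top_x))
    return worst

V, c, length = 1.0, 10.0, 2.0   # arbitrary units, exaggerated V/c = 0.1
tilted = max_offset_from_axis(V, c, length, math.atan(V / c))
upright = max_offset_from_axis(V, c, length, 0.0)
print(f"tilted at arctan(V/c): {tilted:.2e}; upright: {upright:.2e}")
```

With the tilt arctan(V/c) the offset vanishes up to rounding error, while an upright pipe drifts by V times the transit time.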

The aberration angle is defined as the change of the apparent 
angle at which the incident ray is observed in passing from one 
inertial frame of reference to another. Obviously, it is impossible 
to detect the aberration angle within one inertial frame, because 
the direction of the beam (toward a distant star) is always the 
same. Nevertheless the aberration of light was discovered by observing
stars from the Earth, because the Earth moves in an ellipse,
hence it is the same inertial reference system only over a
limited time interval. Every six months the Earth reverses its direction
of motion, and therefore the apparent sighting of the star
should change.

The English astronomer James Bradley was looking for the pa- 
rallactic shift of stars: the apparent path traced by a star over a 
year due to the change in the position of the observer. Fig. S.2c 
explains the appearance of the apparent parallactic shift of the 
North Star. In the course of a year it should describe a small 
ellipse occupying a very specific position relative to the orbit of 
the Earth. 

In 1728, while trying to detect the parallactic shift, Bradley 
discovered the aberration of light; he found that stars lying near 
the pole of the ecliptic indeed describe an ellipse the major
axis of which is equal to 41″. However, the ellipse did not lie as it
should have for parallactic shift (Fig. S.2c).

Fig. S.2. (a) A vertically falling bead must pass through a tube moving
horizontally with velocity V. (b) A wave front must pass without distortion through
a tube moving horizontally with the velocity V. (c) The appearance of the
parallactic shift of a star at the pole of the ecliptic due to the motion of the Earth.
Thanks to this effect, in the course of a year the star describes a small ellipse.
Attention must be given to the observer's location on the Earth; an observer at
point A, for example, sees star A on the celestial sphere (the position of the star
is also denoted by A). (d) The aberration of light and the Earth's motion in
orbit also cause the star at the pole of the ecliptic to describe an ellipse in the
course of the year. However, the respective positions of the Earth and the star
differ from those shown in (c). This is the difference between aberrational and
parallactic shifts. The diagram in the frame shows the appearance of the
aberration angle when the Earth's motion reverses (six months later).

Fig. S.2d shows how the aberration of light affects the apparent
displacement of a star. Star s, lying in the direction perpendicular to the plane of
the Earth’s orbit, is observed from two diametrically opposite positions
A and C. Over six months the angle of the direction toward
the star from these two points varies through $2\varphi$. In Fig. S.2d,
angle $\varphi$ is the angle between the direction at which an observer
would see the star s from a stationary Earth (which, of course, is
impossible), and the apparent direction to the star s. Six months
later the same angle will be in the opposite direction, and the
difference between the apparent directions to the star s is $2\varphi$. We
shall use a simple calculation to evaluate angle $\varphi$. Light from the
North Star falls normal to the plane of the Earth's orbit. The
motion of the Earth is perpendicular to the direction of the ray. The
velocity of the Earth is $3 \times 10^4$ m/s, the velocity of light is
$3 \times 10^8$ m/s. Hence, $\varphi = \arctan(V/c) \approx 20.5''$; $2\varphi \approx 41''$. This was
the value Bradley obtained. He realized that he had failed to detect
the parallax of a fixed star (it was discovered a hundred years
later by Bessel) and instead discovered the aberration of light.
Bradley explained it on the basis of the corpuscular theory, which
we presented here. But the same result obtains for the wave theory
as well. Thus, the explanation of aberration posed no difficulty for
the wave theory. It did, however, involve an inevitable corollary
regarding the “luminiferous ether”. It had to be assumed that light
travels through a medium at rest with respect to the heliocentric
system, otherwise it would not impinge normal to the plane of the
Earth’s orbit.
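
The arithmetic of the aberration angle is easy to reproduce. The sketch below is an illustration under stated assumptions: the function name is invented here, and the "more precise" orbital speed and speed of light are values supplied for comparison, not figures from the text. With the rounded velocities quoted above the angle comes out at about 20.6 seconds of arc; the more precise values give about 20.5, so that the full swing over six months is close to 41 seconds of arc.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def aberration_arcsec(v, c):
    """Aberration angle phi = arctan(V/c), in seconds of arc."""
    return math.atan(v / c) * ARCSEC_PER_RADIAN

# Rounded values quoted in the text (m/s).
phi_rounded = aberration_arcsec(3.0e4, 3.0e8)

# Assumed, more precise values (not from the text): mean orbital speed of
# the Earth and the speed of light in vacuo, both in m/s.
phi_precise = aberration_arcsec(2.978e4, 2.998e8)

print(f"rounded: phi = {phi_rounded:.2f} arcsec, 2*phi = {2 * phi_rounded:.2f} arcsec")
print(f"precise: phi = {phi_precise:.2f} arcsec, 2*phi = {2 * phi_precise:.2f} arcsec")
```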

It thus followed from the aberration of light that the ether was 
stationary in a heliocentric frame of reference (Newton’s absolute 
system). The heliocentric system thus turns out to be a preferred, 
privileged one with respect to the propagation of light. Let us not 
forget that up to the mid-nineteenth century no one knew that light 
waves were electromagnetic waves of a specific frequency. 

If we assume the existence of a luminiferous medium, its role 
differs in no way from that of any material medium transmitting 
vibrations. If the propagation speed of vibrations in a reference 
system in which the medium is at rest is c, then in any other system
moving relative to the medium with the speed ±V the propagation
velocity of the oscillations will be c ± V. Under any wave
theory the velocity of the waves is independent of the motion of 
the source, but it depends upon the motion of an observer relative 
to the medium the oscillation of which produces the waves: in our 
case the ether. Thus phenomena depend not only on the relative 
velocities of bodies, but also on their velocities relative to the medium.

These assertions are best explained with the example of the Dop- 
pler effect for sound waves in air. Let the air be at rest in a sys- 
tem K where the velocity of sound is v; the source is moving re- 
lative to K (i.e. the air) with a velocity V, and the observer is at 
rest in the system (Fig. S.3a). Let us attach to the source a system
K’ (motion toward the observer) or K” (motion from the observer).
We can now reproduce the reasoning in § 3.4. Pulses are
sent from the source at intervals $T'$ or $T''$ ($2\pi/T'$ and $2\pi/T''$ are
the proper frequencies $\omega_0$). The receiver picks up two consecutive
signals at intervals

$$T = T' - \frac{V}{v}T' = T'\left(1 - \frac{V}{v}\right)$$

in the first case, and

$$T = T'' + \frac{V}{v}T'' = T''\left(1 + \frac{V}{v}\right)$$

in the second. Thus, if the source
is moving relative to the observer and the medium, when it approaches
the observer the latter will observe an increase in the frequency,
$\omega = \omega_0/(1 - V/v)$, and when it recedes, a decrease,
$\omega = \omega_0/(1 + V/v)$.

Fig. S.3. Illustration of the derivation of the Doppler effect formula for sound.
(a) The observer and the air are at rest in the K system, and the source is
moving through the medium with the velocity V. (b) The source of the sound and
the air are at rest in the K system, and the observer is moving through the air
with the velocity V.

If, now, the source is at rest relative to the medium, while the
observer is receding from (or approaching) it, then the propagation
velocity of the vibrations relative to the observer will be $v - V$
or $v + V$, respectively. Now the source is at rest in K, and the
observers are connected with the systems K’ and K”. Signals are
emitted from the source at intervals $T$; they will reach the observer
in K’ at intervals

$$T' = T + \frac{V}{v}T' = \frac{T}{1 - V/v},$$

and the observer in K” at intervals

$$T'' = T - \frac{V}{v}T'' = \frac{T}{1 + V/v}.$$

Only now $\omega_0 = 2\pi/T$, whence in receding we obtain a reduction in frequency,
$\omega = \omega_0(1 - V/v)$, and in approach an increase, $\omega = \omega_0(1 + V/v)$.


We find that the equations are different for the same relative
velocity of the source and the observer. Similarly to what was
done in § 3.4 for the case of wave radiation at an angle to the
direction of propagation, we obtain $\omega = \omega_0\left(1 - \frac{V}{v}\cos\theta\right)$ and
$\omega = \omega_0/\left(1 - \frac{V}{v}\cos\theta\right)$.
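The asymmetry between the two cases is easy to check numerically. The following is only a minimal sketch: the function names and the sample values (a 440 Hz tone, a sound velocity of 340 m/s, a speed of 34 m/s) are our own assumptions for illustration and do not come from the text.

```python
import math


def doppler_moving_source(omega0, V, v, approaching=True):
    """Frequency heard when the SOURCE moves through the medium at speed V
    and the observer is at rest in the medium: omega0 / (1 -/+ V/v)."""
    if not 0 <= V < v:
        raise ValueError("this sketch assumes 0 <= V < v")
    sign = -1.0 if approaching else 1.0
    return omega0 / (1.0 + sign * V / v)


def doppler_moving_observer(omega0, V, v, approaching=True):
    """Frequency heard when the OBSERVER moves through the medium at speed V
    and the source is at rest in the medium: omega0 * (1 +/- V/v)."""
    if not 0 <= V < v:
        raise ValueError("this sketch assumes 0 <= V < v")
    sign = 1.0 if approaching else -1.0
    return omega0 * (1.0 + sign * V / v)


if __name__ == "__main__":
    omega0 = 2.0 * math.pi * 440.0   # proper frequency (illustrative value)
    v_sound = 340.0                  # velocity of sound in air, m/s (illustrative)
    V = 34.0                         # relative speed, m/s (illustrative)
    for approaching in (True, False):
        w_src = doppler_moving_source(omega0, V, v_sound, approaching)
        w_obs = doppler_moving_observer(omega0, V, v_sound, approaching)
        label = "approach" if approaching else "recession"
        print(f"{label}: moving source {w_src / omega0:.6f}, "
              f"moving observer {w_obs / omega0:.6f}")
```

For the same relative velocity the two ratios differ in the second order in $V/v$, which is exactly the point made above: with a medium present, it matters who is moving relative to it.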


This example readily demonstrates that the ether assumption 
violates the relativity principle. Of course, the relativity principle 
is valid in the presence of a medium, but to make the conditions
of the experiment identical, the velocity of the reference system re- 
lative to the medium must be the same. In other words, every re- 
ference system must be associated with a definite medium. Thus, 
for the relativity principle to hold the ether would have to be en- 
trained, or partially entrained by the reference system (the ether 
drag). This is a strange assumption, but nevertheless it was de- 
veloped, though in a different connection. 

If the hypothesis of the stationary ether were correct it could 
explain other optical phenomena. From this point of view, Fizeau's 
experiment (1851) led to some very mysterious results. We already
described the experiment in § 3.5, but let us take a look at it from
the point of view of a physicist of the 19th century. Let the water 
be at rest in the system K’, which is moving together with the 
water with velocity V relative to the laboratory. Let us also as- 
sume that the laboratory can be considered the preferred reference 
frame in which the ether is at rest. Light propagates through the 
ether. The substance changes its phase velocity, but the velocity
of the substance does not matter. Consequently, the velocity of 
light in the laboratory system, v, is simply equal to the velocity 
of light, v’, in the stationary water. Of course, v’ = c/n. Assume 
for a moment that the ether is “dragged” together with the water. 
Then, naturally, the velocity of light, v’, would be compounded 
with the velocity of the ether, ie. the water, and we would get 
$v = v' + V$. An ether endowed with the strange property of “partial”
drag would give $v = v' \pm kV$, the sign depending upon the
relative direction of motion of the light and the medium. Fizeau’s
experiment, which was subsequently repeatedly confirmed, gave
the result $v = v' \pm \left(1 - \frac{1}{n^2}\right)V$. In any event, the stationary ether
hypothesis contradicted the results of Fizeau’s experiment.
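The three hypotheses can be compared side by side. The sketch below is illustrative only: the refractive index 1.33 and the flow speed 7 m/s are assumed sample values, the function names are ours, and the relativistic law of addition of velocities is included merely to show that it reproduces the partial-drag coefficient $1 - 1/n^2$ to first order in $V$.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s


def speed_with_drag(n, V, drag):
    """Laboratory speed of light moving along the flow: v = c/n + drag * V.
    drag = 0: stationary ether; drag = 1: ether fully dragged by the water."""
    return C / n + drag * V


def fizeau_coefficient(n):
    """Partial-drag coefficient found in Fizeau's experiment: 1 - 1/n^2."""
    return 1.0 - 1.0 / n ** 2


def speed_relativistic(n, V):
    """Relativistic addition of the velocities c/n and V (same direction)."""
    u = C / n
    return (u + V) / (1.0 + u * V / C ** 2)


if __name__ == "__main__":
    n, V = 1.33, 7.0  # assumed sample values for water
    u = C / n
    print("stationary ether: v - c/n =", speed_with_drag(n, V, 0.0) - u)
    print("full drag:        v - c/n =", speed_with_drag(n, V, 1.0) - u)
    print("partial drag:     v - c/n =",
          speed_with_drag(n, V, fizeau_coefficient(n)) - u)
    print("relativistic:     v - c/n =", speed_relativistic(n, V) - u)
```

Only the partial-drag value agrees with the relativistic one; the stationary-ether and full-drag hypotheses give an excess velocity of zero and of $V$, respectively.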

The battle for recognition of the ether did not end with this, of 
course, but before describing further attempts to detect the ether 
it is useful to dwell on the relationship between physical experi- 
ment and theory. Cognition of nature involves the search for laws 
that correctly reflect existing relationships or, philosophically 
speaking, the search for objective laws of the objective world exist- 
ing independently of us. Nature is studied by people, and they in- 
variably introduce their subjective notions and sensations, to say 
nothing of inevitable errors. Consequently, the laws of nature must 
be verified. What is the criterion of correctness of a physical law? 
The correctness of physical laws is revealed in practical activity. 
The guarantee that our knowledge agrees with the laws of nature 
lies in their verification by different people in different places, and 
repeated verification by one person. The most important means of
verifying physical laws and discovering the laws of nature is by
artificially creating the necessary conditions, i.e. by staging a
physical experiment. 

But physics cannot be restricted to experimental results. Physics 
is impossible without theories that can be used to systematize and 
explain various natural phenomena on the basis of a small number 
of fundamental laws. In turn, physical theory — and today it is a 
science in its own right, theoretical physics —is closely linked 
with mathematics. When experimental material accumulates, a 
theory appears to explain a specific set of phenomena. Can an ex- 
periment or series of experiments vindicate or refute a theory? We 
are not, of course, speaking of erroneous experiments, which may 
always occur; ultimately their erroneousness is sure to be shown. 
Sometimes in developing a theory the limits of its application are 
apparent, and one should keep within them. With these reserva- 
tions, the following can be asserted: if a single correct experiment 
carried out within the limits of applicability of a given theory con- 
tradicts that theory, the theory must be assumed wrong. As for 
“proof” of a law or theory through comparison of obtained con- 
clusions with experimental data, no amount of experiments agree- 
ing with a theory can be regarded as the ultimate proof. A theory 
exists and is considered correct as long as its conclusions do not 
contradict some new experiment within the domain of its applicabi- 
lity. The situation is analogous to one of the rules of mathematics: 
the validity of special cases is not proof of the validity of a general 
theorem, but one example to the contrary refutes it. 

Physics is essentially an experimental science. The process of 
cognition of nature, of which physics is a part, is continuous and 
unlimited. The question of establishing the ultimate truth is a
philosophical issue rather than one for any specific science. Individual
physical experiments reveal various specific laws and regularities 
or, from the philosophical point of view, relative truths. Although 
the regularities contain elements of absolute truth, they do not 
provide exhaustive knowledge. In the final analysis, every theory 
is either restricted or erroneous. At each stage a theory’s validity 
is determined by the absence of experimental data contradicting 
it, while its value lies in its ability to explain and predict ob- 
servable phenomena. These remarks, interesting in themselves, 
need not have been cited if they were not useful in setting forth 
the history of the ether. We shall rely mainly on “negative” experi- 
ments pointing to the fallacy of various premises postulated in an 
attempt to salvage or discover the ether. 

Let us return to the assumption that the ether is stationary in 
a heliocentric system (which follows from the observed aberration 
of light). If the ether is stationary in a heliocentric system, the 
Earth in its motion around the Sun should experience an “ether 
wind”. It is not hard to conceive an experiment to discover motion
relative to the ether. It is shown graphically in Fig. S.4. Two
photoelectric cells are sitting on an optical bench parallel to the
Earth’s velocity in orbit. Halfway between them is a light source,
$I$, which emits flashes of light. Light travels through the stationary
ether with the velocity $c$. But cell $C_1$ is moving toward the light
beam, and cell $C_2$ is receding from it. Light travels toward $C_1$ with
the velocity $c + V$, and toward $C_2$ with the velocity $c - V$. Thus,


Fig. S.4. A light source, $I$, and two photoelectric cells, $C_1$ and $C_2$, are mounted
on an optical bench. The bench is positioned parallel to the direction of the
velocity of the Earth in its orbit. If light propagates in the stationary “ether” it
travels from the source $I$ toward cell $C_1$ with the velocity $c + V$, and toward
cell $C_2$ with the velocity $c - V$.


$C_1$ will register the arrival of the light beam before $C_2$, the time
interval being

$$\Delta t = \frac{l}{c - V} - \frac{l}{c + V} \approx \frac{2lV}{c^2}. \tag{S.II.1}$$

Before considering the possibility of carrying out such an exper- 
iment, note the following. In speaking of the velocity of light, the 
ether assumption reduces solely to the assertion that there exists 
a single reference frame in which the velocity of light in vacuo 
is c. It is the one and only preferred system. In all other frames
moving relative to it with velocity V, the velocity of light in vacuo 
is given by the classical rule of addition of velocities, c’ = c+ V. 
Our reasoning applies to a reference frame K connected with the 
ether (or simply with the frame in which the velocity of light in 
vacuo is c). It is assumed here that the signals received by the 
photoelectric cells are registered in the K frame. But if we apply 
the same reasoning to a frame K’ in which the optical bench is at 
rest we obtain exactly the same result. Simply, in that frame the 
velocity of light in vacuo differs from that in K: $c - V$ to the
right, and $c + V$ to the left (Fig. S.4). In the K frame, we recall,
$c - V$ and $c + V$ would be simply the velocities with which light
reaches cells $C_2$ and $C_1$ *. It is not surprising that the results
are the same in K and K’, since in classical mechanics the timing 
of events is absolute. 

Thus, if we could measure the time difference $\Delta t$ (S.II.1) we
would thereby only prove the difference in the speed of light in
frames K and K'. That, of course, would be indirect proof of the
ether (see further on).

* It is worth remembering that according to the STR the velocity of light in
the K' frame is the same as in K, and equal to $c$.

As to the order of magnitude of $\Delta t$ in such an experiment, if we
put, for example, $l = 100$ m, and $V \approx 3 \times 10^4$ m/s (the speed
of the Earth in its orbit), then $\Delta t \approx 10^{-10}$ s. Even modern
hardware cannot measure such a minute time interval.
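The estimate can be reproduced in a few lines. This is a sketch under the assumption, consistent with the form of (S.II.1), that $l$ is the distance from the source to each cell; the function names are ours.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s


def arrival_time_difference(l, V, c=C):
    """Classical difference l/(c - V) - l/(c + V), written as 2lV/(c^2 - V^2)
    to avoid subtracting two nearly equal numbers."""
    return 2.0 * l * V / (c ** 2 - V ** 2)


def arrival_time_difference_approx(l, V, c=C):
    """First-order form 2lV/c^2 used in (S.II.1)."""
    return 2.0 * l * V / c ** 2


if __name__ == "__main__":
    l = 100.0   # metres, as in the text
    V = 3.0e4   # orbital speed of the Earth, m/s, as in the text
    print("exact :", arrival_time_difference(l, V), "s")
    print("approx:", arrival_time_difference_approx(l, V), "s")
```

Both values come out slightly below $10^{-10}$ s, in line with the order of magnitude quoted above; the two forms differ only by a relative amount of order $V^2/c^2$.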

In 1878, Maxwell suggested an experiment employing the phe- 
nomenon of interference of light to detect the motion of the Earth 
relative to the ether, at rest in a heliocentric system (if light ac- 
tually propagated through it). Maxwell thought the required accu- 
racy of measurement to be unattainable, but three years later Mi- 
chelson built an interferometer capable of detecting the motion of 
the experimental set-up relative to the stationary ether. 

Michelson’s experiment, carried out in 1881 *, was as follows 
(Fig. S.5). A beam of light from a source, $I$, is directed on a
half-silvered glass plate, $P$. Half the incident light is reflected, the
other half passes through the glass. Michelson’s instrument (today
it is called Michelson’s interferometer) had two mirrors, $S_1$ and $S_2$,
located as shown in the drawing, at distances $L_1$ and $L_2$ from the
half-silvered glass $P$. All the parts of the interferometer were firmly
secured to a heavy block of stone floating on a disc of wood in a 
tank of mercury so that the whole system could be turned smoothly. 

At the plate $P$ the beam is split in two: beam 1 travelling to
mirror $S_1$, and beam 2 to mirror $S_2$. Each beam reaches its mirror
and returns to plate $P$. As the plate is semi-transparent, a portion
of the light from both beams travels in direction 3. Since each of
the beams 1 and 2 is a portion of the initial beam, the two rays 1
and 2 travelling in the direction 3 are coherent and can interfere.

Let us determine the time it takes the beam of light to travel
from $P$ to $S_2$ and back. The interferometer is at rest in a system
which is travelling with a velocity $V$ relative to the ether. The
distance between $P$ and $S_2$ is $L_2$, the speed of light to the right is
$c - V$, and to the left, $c + V$. Hence, the required time is

$$t_2 = \frac{L_2}{c - V} + \frac{L_2}{c + V} = \frac{2L_2}{c}\,\frac{1}{1 - V^2/c^2}. \tag{S.II.2}$$

Let us now find the one-way time $\tau_1$ it takes beam 1 to travel from $P$ to
mirror $S_1$. In the time $\tau_1$ mirror $S_1$ travels the distance $V\tau_1$, and
the light travels a distance $c\tau_1$ along the hypotenuse of triangle
$PS_1'P''$. From this right-angle triangle it follows that
$(c\tau_1)^2 = L_1^2 + (V\tau_1)^2$, whence

$$\tau_1 = \frac{L_1}{\sqrt{c^2 - V^2}} = \frac{L_1}{c\sqrt{1 - \beta^2}}, \qquad \beta = \frac{V}{c}.$$
* See A. A. Michelson and E. W. Morley, Phil. Mag. (5) 24, 449 (1887).








Fig. S.5. (a) Michelson’s interferometer. (b) Diagrammatic representation of
Michelson’s experiment. A beam of light from the source $I$ is split on the
semitransparent plate $P$ into two beams, 1 and 2, travelling along and
perpendicular to the direction of the Earth’s motion in its orbit. The velocity
of the Earth’s orbital motion is indicated by the arrow and the letter $V$.
Beams 1 and 2 reflect from mirrors $S_1$ and $S_2$, respectively, and return to
plate $P$. After reflecting and refracting, two beams travel in the direction 3.
That is the direction in which the interference pattern is observed. The
decisive element of the experiment consists in turning the whole apparatus
through 90°: beam 1 now travels in the direction of the Earth’s motion, and
beam 2 normal to that direction. If light propagated through the stationary
ether, the difference in the paths of beams 1 and 2 would change and the
interference pattern observed in direction 3 would change (the interference
fringes would shift). The experiment, however, discovered no shift in the
interference fringes.
The total time it takes the light beam to travel from $P$ to $S_1$ and
back is double that amount, so that

$$t_1 = \frac{2L_1}{c}\,\frac{1}{\sqrt{1 - \beta^2}}. \tag{S.II.3}$$


The difference between times $t_1$ and $t_2$ is due not only to the
difference in the lengths of the arms of the interferometer but also 
to the motion of the apparatus. 

One of the arms of the interferometer is aligned parallel to the 
motion of the Earth along its orbit (which is known from astro- 
nomical observations). But we live on Earth which, according to 
the assumption, is moving relative to the ether, hence the experi- 
ment is inevitably conducted in an “ether wind”. Hence, too, it is 
impossible to compare the interference pattern “without the ether 
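wind” and “in the ether wind”. By aligning, for example, arm $PS_2$
in the direction of the velocity $V$, we obtain a certain interference
pattern: an alternation of light and dark fringes *, depending on
the difference in the propagation times of beams 1 and 2:

$$\Delta t = t_2 - t_1 = \frac{2}{c}\left(\frac{L_2}{1 - \beta^2} - \frac{L_1}{\sqrt{1 - \beta^2}}\right) = \frac{2}{c\sqrt{1 - \beta^2}}\left(\frac{L_2}{\sqrt{1 - \beta^2}} - L_1\right). \tag{S.II.4}$$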


This time the difference depends on both $L_1$ and $L_2$ and the
velocity $V$. If the condition $L_1 = L_2$ could be guaranteed the
difference would depend on $L_1 = L_2 = L$ in this way:

$$(\Delta t)_{L_1 = L_2} = \frac{2L}{c}\left(\frac{1}{1 - \beta^2} - \frac{1}{\sqrt{1 - \beta^2}}\right). \tag{S.II.5}$$


When $L$ is known, the interference fringes observed in the apparatus
are uniquely determined by its motion. But, first, it is simpler
not to worry about assuring the condition $L_1 = L_2$, and secondly,
it is more convenient to observe the change in the interference
pattern, that is, the displacement of the fringes. For that the whole
set-up is rotated through 90°. The arms of the interferometer change
places. We then obtain

$$\Delta t' = t_2' - t_1' = \frac{2}{c}\left(\frac{L_2}{\sqrt{1 - \beta^2}} - \frac{L_1}{1 - \beta^2}\right) = \frac{2}{c\sqrt{1 - \beta^2}}\left(L_2 - \frac{L_1}{\sqrt{1 - \beta^2}}\right). \tag{S.II.6}$$





* On Michelson’s interferometer see, for example, the already quoted Optics
by G. Landsberg.

This means that in the rotation of the apparatus the time difference
changes as follows:

$$\Delta t' - \Delta t = \frac{2}{c}\,(L_1 + L_2)\left(\frac{1}{\sqrt{1 - \beta^2}} - \frac{1}{1 - \beta^2}\right) \approx -\frac{L_1 + L_2}{c}\,\beta^2, \tag{S.II.7}$$

and the fringes shift.

If we want to obtain a difference in the distance travelled by the
beams of the order $\lambda$, the change $\Delta t' - \Delta t$ must be of the order of
the vibration period, $T$. But $T = \lambda/c$. For the motion of the Earth in
orbit $\beta = 10^{-4}$; for visible light $\lambda = 5 \times 10^{-5}$ cm, whence the total
length of the interferometer arms $L_1 + L_2 \approx 50$ metres. Such a path for the
light beam can be obtained by repeated reflection.
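The classical (ether) prediction can be evaluated directly from (S.II.4) and (S.II.6). The following is a minimal sketch: the function names are ours, and the equal arms of 25 m each are an assumption chosen only so that $L_1 + L_2 = 50$ m as in the estimate above.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def time_differences(L1, L2, beta, c=C):
    """Return (dt, dt_rot): the classical time differences t2 - t1 before
    rotation, eq. (S.II.4), and after a 90-degree rotation, eq. (S.II.6)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    dt = 2.0 / c * (L2 * g * g - L1 * g)      # arm 2 along the motion
    dt_rot = 2.0 / c * (L2 * g - L1 * g * g)  # arm 1 along the motion
    return dt, dt_rot


def fringe_shift_on_rotation(L1, L2, beta, wavelength, c=C):
    """Change of the time difference on rotation, in units of the period
    T = wavelength / c, i.e. the expected shift measured in fringes."""
    dt, dt_rot = time_differences(L1, L2, beta, c)
    return (dt_rot - dt) * c / wavelength


if __name__ == "__main__":
    beta = 1.0e-4        # Earth's orbital velocity in units of c
    wavelength = 5.0e-7  # 5 x 10^-5 cm expressed in metres
    L1 = L2 = 25.0       # assumed arm lengths, total 50 m
    print("expected shift (fringes):",
          fringe_shift_on_rotation(L1, L2, beta, wavelength))
    print("first-order estimate    :", -(L1 + L2) * beta ** 2 / wavelength)
```

With these values the expected displacement is about one fringe in magnitude, which is why a total optical path of some 50 metres is required.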

Michelson could have detected an “ether wind” blowing at
$10^4$ m/s. Together with all other physicists, he had not the slightest
doubt that such an “ether wind” would certainly be felt. But there
was none. The experiment was repeated many times, with greater
and greater precision: today an “ether wind” blowing at 30 m/s
could be detected, but Michelson’s result, or his null result, as it is
also called, stands fast. There is no doubt that it is correct.

However, it does not necessarily follow from Michelson’s experiment
that there is no ether. Its results can be explained by
endowing the ether with certain properties. The coup de grace
against the ether required other observations. But let us first draw
some conclusions from Michelson’s experiment, without linking it 
with the search for the “ether wind”. The experiment showed that 
a rotation of the interferometer on Earth does not cause a shift in 
the interference fringes. In principle, however, such a shift could 
be associated with the difference in the speed of light along the 
two directions in the frame of reference of the interferometer. 

Thus, regardless of whether the ether exists or not, Michelson’s 
experiment shows that the speed of light over a closed path meas- 
ured on Earth is the same in all directions, i.e. it is isotropic. As 
the Earth moves around the Sun along a closed curve, and can 
therefore be regarded as an inertial frame of reference only over 
a short time interval, our measurements are actually carried out 
in many inertial reference frames. It thus follows from the expe- 
riment that the velocity of light along a closed path is isotropic in 
any inertial frame of reference.

A somewhat modified form of Michelson’s experiment, staged in 
1932 by Kennedy and Thorndike, offers confirmation of Einstein's 
main postulate. From equation (S.II.7) it is apparent that the change in $\Delta t$ also
depends upon c. If the absolute velocity of light in vacuo were 
different in different inertial frames of reference, then a shift in the 
fringes would be observed in passing from one IFR to another. The 
interference pattern was on several occasions observed continuously 
for periods ranging from eight days to one month (with a three-month
interval). During that time the apparatus changed many
inertial frames of reference. It is so sensitive that a variation of 
2 m/s for c could be detected. But nothing happened. It thus follows 
from Michelson’s experiment that the speed of light has the same 
value (travelling there and back) in all directions within a given 
IFR; moreover, it has the same value in vacuo in all inertial 
frames of reference. 

At the risk of interrupting our discourse, it is worth presenting 
the results of Michelson’s experiment in terms of the special theory 
of relativity. If the experiment is staged in an inertial frame of 
reference its result is obvious. In every such system the velocity 
of light in vacuo is isotropic and the interference pattern observed 
in the experiment is due solely to geometric differences in path. In
short, in every IFR in which the interferometer is at rest we obtain
exactly the same picture as we would have in a classical consideration
in the preferred reference system in which the ether is
stationary.

The need for a more complex interpretation of the result arises 
when the experiment is considered in an inertial reference frame 
relative to which the interferometer is in motion. Let the interferometer
be at rest in $K'$ and the experiment be staged in $K$; for
the sake of simplicity we assume that in $K'$ $L_{10} = L_{20}$. We have
added the subscript 0 to stress that we are dealing with proper
lengths. Obviously, equations (S.II.4) and (S.II.6) remain valid,
but $L_2 = L_{20}\sqrt{1 - \beta^2}$, and $L_1 = L_{10}$. It is apparent then that $\Delta t$,
determined according to (S.II.4), vanishes. Naturally, a rotation
of the apparatus yields no effect. Thus, in special relativity the null 
result of Michelson’s experiment is explained by the relativity of 
the length of the measuring rods (which is most directly linked 
with the invariance of the velocity of light, § 2.3). 

To explain the result of Michelson’s experiment and salvage 
the ether, Lorentz and Fitzgerald assumed that in motion relative 
to the stationary ether all bodies contract by the factor $\sqrt{1 - \beta^2}$
in the direction of motion (the “Lorentz contraction”). It is ap- 
parent from the foregoing reasoning that such an explanation is 
possible. But it is essential to emphasize the difference in the rela- 
tivity of the length of measuring rulers in special relativity and in 
the Lorentz contraction. In the STR the contraction is a conse- 
quence of measurements made in the relative motion of reference 
frames. With Lorentz-Fitzgerald it is the consequence of motion 
relative to the ether, which retains the preferred reference system. 
The fallacy of the Lorentz contraction is revealed in a modification 
of Michelson’s experiment. Michelson’s interferometer had arms 
of the same length; Kennedy and Thorndike’s had arms of 
different length. If the interferometer arms are different, a shift 
in the interference fringes should be observed when the interferometer’s
velocity relative to the ether changes. But on Earth an
interferometer participates in three motions relative to the ether: 
the Earth’s motion relative to the Sun, its rotation, and finally, the 
motion of the Sun. The resultant velocity varies by a certain value 
every 12 hours (and six months). These variations should result in 
a shift in the interference fringes. Indeed, if Lorentz’s hypothesis
is correct, then $L_2 = L_{20}\sqrt{1 - \beta^2}$, and from (S.II.4) we get

$$\Delta t = \frac{2}{c\sqrt{1 - \beta^2}}\,(L_{20} - L_{10}). \tag{S.II.8}$$

Without rotating the apparatus, let us see how $\Delta t$ varies with
a variation of the velocity $\beta$ relative to the ether by $\Delta\beta$.
Differentiating (S.II.8) with respect to $\beta$, we obtain

$$\Delta(\Delta t) = \frac{2}{c}\,(L_{20} - L_{10})\,\Delta\!\left(\frac{1}{\sqrt{1 - \beta^2}}\right) \approx \frac{L_{20} - L_{10}}{c}\,\Delta\beta^2. \tag{S.II.9}$$

The latter equation is written to the accuracy of $\beta^2$, but prolonged
observations of the interference pattern revealed no variations.
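What the Lorentz contraction hypothesis predicts for unequal arms can be sketched numerically. The arm lengths (1.00 m and 1.16 m) and the two values of $\beta$ below are assumptions chosen purely for illustration; they are not data from the experiment, and the function names are ours.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def delta_t_contracted(L10, L20, beta, c=C):
    """Time difference (S.II.8) when the arm along the motion is contracted
    by sqrt(1 - beta^2); L10 and L20 are the proper arm lengths."""
    return 2.0 * (L20 - L10) / (c * math.sqrt(1.0 - beta ** 2))


def variation_second_order(L10, L20, beta_before, beta_after, c=C):
    """Second-order estimate (S.II.9) of the change in delta t."""
    return (L20 - L10) / c * (beta_after ** 2 - beta_before ** 2)


if __name__ == "__main__":
    L10, L20 = 1.00, 1.16        # assumed proper arm lengths, metres
    b1, b2 = 1.0e-4, 2.0e-4      # assumed velocities relative to the ether
    exact = delta_t_contracted(L10, L20, b2) - delta_t_contracted(L10, L20, b1)
    print("change in delta t, from (S.II.8):", exact, "s")
    print("change in delta t, from (S.II.9):",
          variation_second_order(L10, L20, b1, b2), "s")
    print("equal arms:", delta_t_contracted(1.0, 1.0, b2), "s")
```

With equal arms the contraction hypothesis gives no effect at all, which is why it survives Michelson’s original experiment; with unequal arms it predicts a variation proportional to $L_{20} - L_{10}$, and it is this variation that was not observed.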

Another way to square the ether with the results of Michelson's 
experiment is to assume that the ether is “dragged along” by 
moving bodies. But as we have seen, the aberration of light 
“agrees” only with a “stationary ether in a heliocentric system”. 
An experiment specially staged by Fizeau (see § 3.5, where it is ex- 
plained in terms of special relativity) to determine the ether “drag” 
led to the conclusion that there was a “partial drag”. 

Of course, the mentioned experiments and observations far from 
exhaust the attempts to establish the properties of the ether. But it 
is already obvious that the ether would have to be endowed with 
extremely contradictory properties. But the ether’s most significant 
“contribution” to physics would probably have been the rejection 
of the relativity principle in electrodynamics. 

In 1905, Einstein’s paper, On the Electrodynamics of Moving 
Bodies, appeared. It virtually set forth the whole of the special 
theory of relativity, which not only offered a natural explanation 
of the result of Michelson’s experiment but also correctly interpret- 
ed all known mechanical, electrodynamic and optical phenomena. 

From the outset it extended the relativity principle to all physics 
and unambiguously asserted the equality of all inertial frames of 
reference in vacuo, thereby making the ether redundant. Not a 
single experimental fact contradicts special relativity. 

In his initial experiments Michelson could have discovered a 
variation in the speed of light in a change of its direction relative 
to the motion of the Earth down to 0.15 m/s; later experimenters
could have detected a variation of 0.015 m/s; with lasers the
detectable change in speed is a mere $3 \times 10^{-5}$ m/s. Yet no variation
in the speed of light relative to a moving observer has ever been 
detected. 

That the speed of light in vacuo does not depend on the motion 
of the source has been repeatedly verified. One experiment was 
carried out with an extraterrestrial source, the Sun (A. M. Bonch- 
Bruevich, 1956). If the velocity of light depends on the motion of 
the source, then by measuring the velocity of light emitted from 
two opposite points of the equator it should be possible to detect 
the difference between those velocities. No such difference was detected.
A laboratory experiment was conducted in which the flight of
gamma-quanta over a certain distance was compared; gamma-quanta emitted by
a stationary and a moving source (radioactive nuclei) were studied, and
again the independence of the velocity of light was confirmed.

It can be confidently declared that, despite the enormous increase in the
accuracy of experiments, there is no indication of the existence of a
preferred reference system, or of any difference in the velocity of light
in vacuo in different inertial frames of reference, or of any manifestation
of the ether.

Fig. S.6. Diagrammatic representation of the Sagnac experiment.

In conclusion we should note that accelerated motion of a reference
system relative to an inertial frame of reference can, of
course, be detected. A mechanical experiment of this kind, Foucault’s
experiment, was described in § 1.5. There are also optical
variants of the experiment, which we shall mention to round out
the picture. We shall describe Harress’s experiment (1912), subsequently
repeated by Sagnac. Three mirrors, A, B, C, and a
semitransparent plate D are mounted on a turntable (Fig. S.6) together
with a light source L and photographic plate P. A beam of light Q 
is split at plate D into two beams, DABCDP and DCBADP, tra- 
velling along the path ABCD in opposite directions. If the system 
is at rest, interference of the two beams travelling from L and 
split at D occurs on the plate. When the turntable rotates the in- 
terference fringes should displace owing to the change in path 
lengths. 

Disregarding the deviation of a geocentric system from an
inertial system, let us examine events in the reference system of
the Earth. For the sake of simplicity we shall assume that there
are many mirrors and the path of the beams is virtually a circle.
Then the speed with which the light catches up with the turntable
when it travels in the direction of rotation is $c - V = c - \Omega R$
(where $R$ is the radius of the turntable and $\Omega$ is the angular velocity
of rotation); for the beam travelling in the opposite direction
the velocity is $c + V = c + \Omega R$. The time it takes the beam to
travel around the circumference is $t_1 = 2\pi R/(c - V)$ in the first
case, and $t_2 = 2\pi R/(c + V)$ in the second. The difference between
the two times is

$$\Delta t = t_1 - t_2 = 2\pi R\left(\frac{1}{c - V} - \frac{1}{c + V}\right) = \frac{4\pi R V}{c^2 - V^2} \approx \frac{4S\Omega}{c^2},$$


where S is the area of the turntable. Sagnac observed a shift in the 
fringes that agreed nicely with this formula. The shift can be used 
to determine the angular velocity $\Omega$.
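The formula, and its inversion to find the angular velocity, can be sketched as follows. The turntable radius and rotation rate are assumed values for illustration only, and the function names are ours.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def sagnac_time_difference(R, Omega, c=C):
    """Exact difference 2*pi*R*(1/(c - V) - 1/(c + V)) with V = Omega * R,
    written as 4*pi*R*V/(c^2 - V^2)."""
    V = Omega * R
    return 4.0 * math.pi * R * V / (c ** 2 - V ** 2)


def sagnac_time_difference_approx(R, Omega, c=C):
    """Approximate form 4*S*Omega/c^2 with S = pi * R^2."""
    S = math.pi * R ** 2
    return 4.0 * S * Omega / c ** 2


def angular_velocity_from_delay(delta_t, S, c=C):
    """Invert the approximate formula: Omega = c^2 * delta_t / (4 * S)."""
    return c ** 2 * delta_t / (4.0 * S)


if __name__ == "__main__":
    R = 0.25                     # assumed turntable radius, metres
    Omega = 2.0 * math.pi * 2.0  # assumed rotation rate: 2 revolutions per second
    dt = sagnac_time_difference(R, Omega)
    print("time difference:", dt, "s")
    print("recovered Omega:", angular_velocity_from_delay(dt, math.pi * R ** 2))
```

Since $V \ll c$ for any laboratory turntable, the exact and approximate expressions are indistinguishable in practice, and the measured delay yields $\Omega$ directly once the enclosed area $S$ is known.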

If the Earth is used as the turntable its angular velocity can be 
determined. This experiment was carried out in 1925 by Michelson 
and Gale. The angular velocity corresponded to the component of
the angular velocity of rotation of the Earth along a plumb line 
at the point of observation. For the experiment two kilometres of 
pipes were laid and a second circuit was built to determine the zero 
point of displacement of the fringes. Michelson’s result was
$0.230 \pm 0.005$, the theoretical figure being 0.236. Excellent
agreement!

Thus, unlike uniform translational motion of the Earth, its rota- 
tion can be detected by various physical experiments. 

III. Was Michelson’s experiment “decisive” for the creation of 
the special theory of relativity? Michelson’s experiment is given 
great prominence in virtually all books on the history of special 
relativity. Most authors assume, one way or another, that special 
relativity was an upshot of attempts to explain Michelson’s experiment,
which is the theory’s principal experimental basis.

That is the place assigned to Michelson’s experiment in the 
only book in Russian devoted to the experimental foundations of 
the STR, written by S. I. Vavilov in 1928: “The story of [Michelson’s
experiment] is set forth here in fairly great detail because
the basic postulates of the relativity theory were formulated on its 
basis.” And in his foreword Vavilov wrote, “After reading this book 
the reader will understand why it is adorned with a picture of Mi- 
chelson.” 

This claim is repeated in virtually all our textbooks dealing 
with the history of special relativity. In Y. B. Rumer and M. S. Ryv- 
kin’s The Theory of Relativity (Moscow, 1960) we find: “Unlike all 
preceding investigators, Einstein saw the negative result of Mi- 
chelson’s experiment as...”. Foreign authors toe the same line on 
this score. To cite but one example, Laue, in a book written in
1911, states that Michelson’s experiment “became, as it were, the 
fundamental experiment for the relativity theory.” 

Besides textbooks and popular expositions there are, of course, 
books on special relativity written by Einstein himself. In his book 
on the special and general theories of relativity, which he subtitled, 
A Comprehensible Exposition (gemeinverständlich), Michelson’s
experiment is mentioned in § 16: “The Special Theory of Relativity
and Experiment”. However, it is not clear from the text whether
there was any direct connection between Michelson’s experiment 
and the enunciation of the theory. It is not mentioned at all. 
Nothing in other writings and statements of Einstein indicated any 
contradiction with the accepted view that Michelson’s experiment 
was actually the point of departure for special relativity. Students 
were taught so, and schoolchildren are told as much today. It was 
therefore most surprising to read in an article by R. Shankland, 
published in 1963, the following excerpt from his interview with 
Einstein dating back to 1950: 

“When I asked him how he had learned of the Michelson-Mor- 
ley experiment, he told me that he had become aware of it through 
writings of H.A. Lorentz, but only after 1905 had it come to his 
attention! ‘Otherwise’, he said, ‘I would have mentioned it in my
paper!’” Indeed, Einstein’s 1905 paper contains no mention of
Michelson’s experiment or references to Lorentz’s papers.

We know that Lorentz and Poincaré came very close to postulat- 
ing special relativity, but in fact it was enunciated by Einstein, and 
virtually in that single work in 1905. Thus, if we speak of the
decisive significance of Michelson’s experiment for developing spe- 
cial relativity, we should determine its influence on Einstein’s work. 
And from Shankland’s article we learn that Einstein first heard of 
Michelson’s experiment only after the creation of the special theory 
of relativity. 

Why should the question of the role of Michelson’s experiment 
in Einstein’s efforts to formulate special relativity be of such con- 
cern to the teacher? It is intriguing, of course, as a curious fact, 
but hardly worth devoting a whole paragraph to it. But it is not 
just a matter of curiosity. It is inevitable that the history of the 
development of a theory should be reflected in teaching. It is hard, 
and in some cases impossible, to sidestep the history of an issue. 
With time, of course, the presentation of a discipline becomes logi- 
cally streamlined (teaching is never in vain!), but the real history 
of a theory’s development does not always reflect the logic of its 
elaboration. Nature does not necessarily reveal its secrets in a 
sequence most convenient for their interpretation. In the initial 
period after the enunciation of a theory its academic presentation 
follows rather in the historical wake of its elaboration than
according to the logical scheme which can be constructed after the theory
has been completed. 

Thus, looking from this aspect at the academic presentation of 
special relativity, we can see that, if Shankland understood
Einstein correctly, the teaching of the theory did not even follow the
steps in its creation. A curious situation! 

The question is not of denying the role of Michelson’s experiment. 
Its history and implementation cannot but arouse admiration. Mi- 
chelson’s experiment occupies an outstanding place in the history 
of natural science. And yet it played a rather unfortunate part in 
the evolution of the traditional scheme of describing special
relativity.

When placed at the basis of instruction in the STR, Michelson’s 
experiment inevitably introduces the ether. It is impossible to ex- 
plain its meaning without speaking of the difficult search for a 
material medium through which light propagates (see Supple- 
ment II). But today we all know only too well that no such medium 
is necessary, and from the methodological point of view that is
where we should begin. There is absolutely no need to go back 
to the intellectual atmosphere of the later nineteenth century. 

Indeed, the ether played a prominent part in the physical views 
of the 19th century. It was, in fact, the ether concept which sug- 
gested to Maxwell the idea of the experiment ultimately carried 
out by Michelson and Morley. But erroneous notions, which may
have played their part at a certain stage in the development of 
science, are ultimately discarded. When Galileo enunciated his 
inertia principle he immediately discarded Aristotle’s doctrine that 
motion had to be continuously supported. When the transformation 
of mechanical energy into heat was discovered the phlogiston con- 
cept had to be discarded. The theory of relativity began with the 
rejection of absolute motion and the ether. But nowadays no one 
brings up Aristotle’s doctrines in expounding mechanics, no one 
recalls phlogiston in lecturing on heat; why then should the ether 
be kept in describing special relativity at school and college? The 
introduction of the ether in modern expositions of special relativity 
is, to say the least, strange. First one has to explain at length the 
reason for introducing the ether, and then conclude by declaring 
that it does not, after all, exist. Can such an approach be called 
methodical? 

It is sometimes said that there is no getting away from the ether: 
it had to be postulated by analogy with the propagation of sound 
or waves on water. One should only be sure to explain at the right 
time that there is a substantial difference between the way electro- 
magnetic and gravitational waves travel, on the one hand, and 
elastic waves, on the other, and that matter is not required for the 
propagation of electromagnetic waves. Light can propagate in the 
absence of matter in the conventional sense of the word (possessing
rest mass). So there is no reason for the ether after all! 

To this one could reply, “True, it is better not to introduce the 
‘ether’, which doesn’t exist anyhow, but special relativity is a com- 
plicated thing. The road to it was long and difficult, and it lay 
through the ether hypothesis, which was discarded when the need 
for it passed. But it was a natural theory which appeared in the
search for a correct solution, it reflects the logic of human thinking, 
and there is no harm in setting it forth.” Everything in this reason- 
ing is correct except one thing. Einstein arrived at special relativity 
not via the ether (nor Michelson’s experiment), but along a 
simpler and clearer road. And if one speaks of the logic of human 
thinking it is worth taking a closer look at Einstein’s reasoning. 

What role did Michelson’s experiment play in Einstein’s work? 
We have already cited Einstein's statement as quoted by Shank- 
land. But Shankland’s article appeared after Einstein’s death and 
had not been, so to say, “authorized”. Not long ago a letter writ- 
ten by Einstein and containing a direct answer to the question was 
discovered in the Einstein archive at Princeton, and it removes all 
doubts. The story of the letter is as follows. On February 2, 1954, 
a year before Einstein’s death, a certain Mr Davenport wrote him, 
saying that he was looking for evidence that Michelson had “in- 
fluenced your thinking and perhaps helped you to work out your 
theory of relativity.” Not being a scientist, he asked Einstein for
“a brief statement in non-technical terms, indicating how Michelson
helped to pave the way, if he did, for your theory.”

Einstein replied almost immediately, on February 9, 1954. It is 
his last statement on the question. It seems obvious that he had 
reflected about it before (after all, he had spoken of it with
Shankland). The letter is clear and unequivocal. Here it is:
“Dear Mr Davenport: 

“Before Michelson’s work it was already known that within 
the limits of the precision of the experiments there was no influence 
of the state of motion of the coordinate system on the phenomena, 
resp. their laws. H.A. Lorentz has shown that this can be understood
on the basis of his formulation of Maxwell’s theory for all
cases where the second power of the velocity of the system could
be neglected (effects of the first order).

“According to the status of the theory, it was, however, natural 
to expect that this independence would not hold for effects of 
second and higher orders. To have shown that such expected effect 
of the second order was de facto absent in one decisive case was 
Michelson’s greatest merit. This work of Michelson, equally great 
through the bold and clear formulation of the problem as through 
the ingenious way by which he reached the very great required pre- 
cision of measurement, is his immortal contribution to scientific 
knowledge. This contribution was a new strong argument for the
non-existence of ‘absolute motion’, resp. the principle of special 
relativity which, since Newton, was never doubted in mechanics 
but seemed incompatible with electro-dynamics. 

“In my own development Michelson’s result had not had a considerable
influence. I even do not remember if I knew of it at all
when I wrote my first paper on the subject (1905). The explana- 
tion is that I was, for general reasons, firmly convinced how this 
could be reconciled with our knowledge of electro-dynamics. One 
can therefore understand why in my personal struggle Michelson's 
experiment played no role or at least no decisive role. 

“You have my permission to quote this letter. I am also willing
to give you further explanations if required.


“Sincerely yours, 
Albert Einstein” 


The letter is clear, and there is nothing to add. True, it is con- 
tradicted by one pronouncement of Einstein’s known from B. Jaffe’s 
book. Einstein met Michelson but once, in 1931, in Pasadena. In 
a short speech, before those present, he said addressing Michelson, 
“Through your marvelous experimental work [you] paved the way 
for the development of the theory of relativity.” Jaffe’s book, from 
which these words are quoted, does not give Einstein’s speech in 
full. From an account of it in German it follows that the book 
omits a whole sentence and that Einstein spoke of the “road” to the 
general theory of relativity. Thus, there is not the slightest hint on 
Einstein’s part that Michelson’s experiment was in any way de- 
cisive, although Einstein invariably stresses its beauty and funda- 
mental contribution to science. In a letter to Jaffe [26] he writes 
that the experiment “strengthened my conviction concerning the 
validity of the principle of the special theory of relativity.” That is 
probably the most correct assessment of the significance of Mi- 
chelson’s experiment for Einstein’s work. The experiment’s signi- 
ficance in the history of physics was quite different, perhaps even 
“decisive”. The experiment’s “null” result quite obviously domina- 
ted the work of Lorentz and many others. But that was not the 
road along which special relativity evolved. Paradoxically, Lo- 
rentz, who discovered the famous “transformations” that bear his 
name and which embody the very essence of special relativity, 
remained a long way off from enunciating the STR. 

Finally, of interest is the question why academic instruction has 
been so persistent in following the “Lorentz way”? One fortuitous
consideration which probably played no mean part ought to be
mentioned. Here is a small quotation from the book by H. Bondi
[21]:
“What has bedevilled this issue in textbooks is the undue prominence
given to the Michaelson-Morley experiment ... . Einstein
said that at the time he wrote his basic paper on relativity (1905) 
he had never heard of the experiment. Later on when it was decided 
to reprint various essays on relativity it was decided by the pub- 
lishers (with the advice of somebody) to start in the middle of one 
of Lorentz’s essays. The first part that was included happened to 
be the Michaelson-Morley experiment. For this reason since then 
everybody, or nearly everybody, has felt obliged to start in the 
same way. And what a complicated start it is!” 

Indeed, if we take Lorentz’s book Versuch einer Theorie der 
elektrischen und optischen Erscheinungen in bewegten Körpern
(Leiden, 1895), we see in §§ 89-92 a description of Michelson’s 
experiment. 

As for the exclamation, “And what a complicated start it is!”, 
far from everyone agrees with it. But the question of how most 
reasonably to explain special relativity to the student is not an 
idle one: today special relativity is not only a part of the college 
course in general physics, it is also present in the high school cur- 
riculum. Instruction should, doubtlessly, be based on the ideas of 
modern physics and not include outdated notions of the past. 

Einstein’s elaboration of special relativity began with his rejec- 
tion of the “luminiferous ether”, and in that sense Michelson’s ex- 
periment was certainly not “decisive”. Einstein’s reasoning is suf- 
ficiently simple and logical, and there is every reason to use it in 
expounding the special theory of relativity. 

IV. Why shouldn’t the mass-velocity dependence, or the rela- 
tivistic mass, be introduced? Textbooks on special relativity (es- 
pecially the older ones) often introduce the “relativistic mass”, 


$$m_{\mathrm{rel}} = m\gamma = \frac{m}{\sqrt{1-\beta^2}}, \qquad \beta = v/c,$$


which, by definition, depends upon the velocity, and try to give it 
independent meaning. 

Whether the relativistic mass should be introduced or not is a
purely methodological issue. If $m_{\mathrm{rel}}$ is regarded simply as an
abbreviated notation, no problem arises at all. But the
physical interpretation of relativistic mechanics is an entirely
different question. Here misunderstandings and vague interpretations
often arise, and they are the subject of this discussion.

It is not hard to understand how the temptation to introduce the 
relativistic mass appeared. One need but juxtapose the Newtonian 
and relativistic equations of motion (5.37a) and (5.37b), 


$$\frac{d}{dt}(m\mathbf{v}) = \mathbf{F}, \quad \text{(a)} \qquad\qquad \frac{d}{dt}(m\gamma\mathbf{v}) = \mathbf{F}, \quad \text{(b)}$$
for the conclusion to suggest itself that the “only difference” between
them is that in the relativistic equation the mass depends
upon the velocity. The expression $m\gamma$ is then taken from under the
derivative sign and declared the relativistic mass, with an inde- 
pendent meaning attached to it. There are many objections against 
such an interpretation; they will be set forth later on. On the other 
hand, emphasis will be laid on the advantages of employing the 
invariant rest mass. 

A reasonable relativistic interpretation of mass should, like all 
relativistic mechanics, in the final analysis, rely on four-dimen- 
sional concepts. Although in many cases academic instruction is 
so concise that the introduction of four-dimensional concepts is 
impossible, we cannot forget that the construction of relativistic 
mechanics (Chapter 5) inevitably requires the introduction of a 
four-dimensional world. If we are forced to restrict ourselves to 
three-dimensional formulations of relativistic mechanics, then in 
interpreting its results we must go back to its very sources. 

Let us recall briefly what we did at the beginning of Ch. 5. We
defined the 4-momentum vector as the product of the 4-velocity and
a scalar, the rest mass: $\vec{P} = m\vec{V}$. As for the equation of motion,
the derivative $d\vec{P}/d\tau$ has entered its left-hand side. It can be seen
from this (see also § 5.1) that the relativistic factor $\gamma$ under the
derivative sign in (5.37b) appeared because in 4-space-time we
employ invariant proper time instead of non-invariant coordinate
time. The first three components of $\vec{P}$ include simply the first three
components of the 4-velocity $\vec{V}$, which bear no relation to dynamics.
Thus, the factor $\gamma$ refers to the properties of 4-space-time, not
to the internal state of the particle.

When we introduced a scalar, the rest mass, then in 4-space the 
quantity obtained exact transformational properties; in other words, 
it is at once possible to indicate the law according to which it 
changes in passing from one inertial frame of reference to another. 
The rest mass is a scalar, i.e. invariant. This is a very important
point. Since $\vec{P}^2 = -m^2c^2$ (see (5.47)), it is apparent that the square
of the rest mass is proportional to the square of the absolute value of the
energy-momentum 4-vector of a particle.

As in classical mechanics, we want to associate the mass with 
the properties of the particle itself, in which case the only reason- 
able method of introducing the mass is in terms of the rest mass. 
One could, of course, say that a particle’s acceleration to relativis- 
tic velocities causes changes in its internal properties, notably 
mass. But even if we do not touch the particle and introduce an- 
other inertial reference frame, the particle’s equation of motion will 
remain (5.37b). Thus, taking “relativistic mass” at face value, it 
increases for no physical reason. Such a result is hardly satisfactory.

There is no need to determine the “relativistic mass” experimentally.
Only microparticles actually attain relativistic velocities,
and the rest mass is sufficient to identify them. It is easily found 
if we determine the particle’s energy and momentum from equation 
(5.50):

$$m^2c^2 = \frac{E^2}{c^2} - p^2.$$


This is just what is done in high-energy physics. 
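
As a rough illustration of how (5.50) serves to identify a particle, the following Python sketch recovers the rest mass from an energy and a momentum and checks that the value is unchanged by a boost along the momentum. The function names, the choice of units with c = 1 and the sample numbers (a rest energy of 938.272 MeV, roughly that of a proton) are assumptions made for the example and do not come from the text.

```python
import math

def rest_mass(energy, momentum, c=1.0):
    """Rest mass from m^2 c^2 = E^2/c^2 - p^2 (equation (5.50) of the text)."""
    m2c2 = (energy / c) ** 2 - momentum ** 2
    if m2c2 < 0.0:
        raise ValueError("E^2/c^2 < p^2: not a physical energy-momentum pair")
    return math.sqrt(m2c2) / c

def boost(energy, momentum, beta, c=1.0):
    """Energy and momentum seen from a frame moving with velocity beta*c
    along the direction of the momentum (standard Lorentz transformation)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (energy - beta * c * momentum), g * (momentum - beta * energy / c)

# Hypothetical measurement in units with c = 1 (MeV and MeV/c).
m_true = 938.272
p = 500.0
E = math.hypot(m_true, p)          # E = sqrt(m^2 + p^2) when c = 1

print(rest_mass(E, p))             # close to 938.272
E2, p2 = boost(E, p, beta=0.8)
print(rest_mass(E2, p2))           # same value: the rest mass is invariant
```

The energy and momentum each change under the boost, while the combination entering (5.50) does not; this is the sense in which the rest mass identifies the particle.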

But perhaps the “dependence of mass on velocity” can be veri- 
fied directly? We should first note that no unique dependence of 
mass on velocity follows from the mechanics of special relativity. 
As pointed out in § 5.3, the essence of the matter is that, unlike 
Newtonian mechanics, in relativistic mechanics the directions of 
acceleration and force do not, in the most general case, coincide. 
In Newtonian mechanics a body’s mass can be determined from 
the ratio of the magnitude of the force to the magnitude of the acceleration
it imparts to the body: $m = F/(dv/dt)$. If we similarly
determine the mass in relativistic mechanics we arrive at the mass 
tensor (regarding tensors see Appendix I, § 3). Indeed, let us 
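rewrite (5.38) in the form ($\alpha = 1, 2, 3$)

$$\frac{dv_\alpha}{dt} = \frac{1}{m\gamma}\left[F_\alpha - \frac{v_\alpha}{c^2}(F_\beta v_\beta)\right] = \frac{1}{m\gamma}\left(\delta_{\alpha\beta} - \frac{v_\alpha v_\beta}{c^2}\right)F_\beta,$$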

whence we see that the acceleration is a linear vector function of 
the force, the factors of which (i.e. the tensor components) depend 
upon the velocity of the body. These factors define the inverse mass
tensor:

$$m^{-1}_{\alpha\beta} = \frac{1}{m\gamma}\left(\delta_{\alpha\beta} - \frac{v_\alpha v_\beta}{c^2}\right).$$


The appearance of the mass tensor has a simple physical mean- 
ing: the magnitude of the acceleration depends upon the mutual 
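directions of the force and the velocity. A particle’s velocity represents
a kind of preferred direction. For simplicity’s sake we
direct axis 1 along the velocity. Then $v_\alpha = \delta_{\alpha 1} v$, where $v$ is the
absolute value of the velocity. We have

$$m^{-1}_{\alpha\beta} = \frac{1}{m\gamma}\left(\delta_{\alpha\beta} - \frac{v^2}{c^2}\,\delta_{\alpha 1}\delta_{\beta 1}\right),$$

or

$$m^{-1}_{\alpha\beta} = \frac{1}{m\gamma}\begin{pmatrix} \gamma^{-2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
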
We find the mass tensor according to the conventional rule (the
co-factors divided by the magnitude of the determinant):

$$m_{\alpha\beta} = m\gamma^3\begin{pmatrix} 1 & 0 & 0 \\ 0 & \gamma^{-2} & 0 \\ 0 & 0 & \gamma^{-2} \end{pmatrix},$$

whence it is obvious that the principal values of the mass tensor
$m_{\alpha\beta}$ are one “longitudinal” mass ($m\gamma^3$) and two “transverse” masses ($m\gamma$).

But the “mass tensor” appeared solely from the desire to intro- 
duce the “dependence of mass on velocity”; the appearance of such 
a tensor is obviously unjustified: the rest mass is quite adequate
for interpreting any results. 
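
The claim that the rest mass alone is adequate can be illustrated numerically. The sketch below applies the inverse mass tensor $m^{-1}_{\alpha\beta} = (\delta_{\alpha\beta} - v_\alpha v_\beta/c^2)/(m\gamma)$ to a given force and forms the ratio of force to acceleration, using nothing but the rest mass and the velocity. The function names, units with c = 1 and the sample velocity β = 0.6 are assumptions chosen for illustration.

```python
import math

def inverse_mass_tensor(m, v, c=1.0):
    """3x3 tensor (delta_ab - v_a v_b / c^2) / (m * gamma) for rest mass m
    and velocity vector v."""
    beta2 = sum(x * x for x in v) / c ** 2
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    return [[((1.0 if a == b else 0.0) - v[a] * v[b] / c ** 2) / (m * gamma)
             for b in range(3)] for a in range(3)]

def acceleration(m, v, force, c=1.0):
    """Acceleration dv/dt produced by `force` on a particle moving with velocity v."""
    inv = inverse_mass_tensor(m, v, c)
    return [sum(inv[a][b] * force[b] for b in range(3)) for a in range(3)]

m, v = 1.0, [0.6, 0.0, 0.0]            # axis 1 along the velocity, gamma = 1.25

a_long = acceleration(m, v, [1.0, 0.0, 0.0])
a_trans = acceleration(m, v, [0.0, 1.0, 0.0])
print(1.0 / a_long[0])                 # F/a along v:  m * gamma**3 (about 1.953)
print(1.0 / a_trans[1])                # F/a across v: m * gamma    (about 1.25)

# For an oblique force the acceleration is not parallel to the force.
a_obl = acceleration(m, v, [1.0, 1.0, 0.0])
print(a_obl[0] / a_obl[1])             # about 0.64, not 1
```

The two ratios differ from each other, and for an oblique force no single ratio exists at all, which is why a scalar “mass depending on velocity” cannot replace the equation of motion.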

Thus, the answer to the question regarding the “experimental 
detection” of the dependence $m\gamma$ is that individual components of
the tensor $m_{\alpha\beta}$ can be determined. In particular, the “transverse
mass” is easily found in the motion of a charged particle in a 
magnetic field (see § 5.5); the “longitudinal mass” is obtained in 
the motion of a charged particle without an initial velocity in a 
uniform electric field. But if we speak of all experiments with re- 
lativistic particles in general, the simplest thing is to say that they 
confirm the relativistic equation of motion. In the two considered 
special cases when the equation of motion resembles the Newto- 
nian (the directions of acceleration and force coincide) the equa- 
tion of motion indeed looks as though the change in mass is due 
to the velocity. It is, however, different for the two cases. In all 
other cases the equation of motion is substantially different from 
the Newtonian. Without taking this into account one can come up 
against some paradoxes (see § 8.2). 

If we employ only the rest mass in relativistic mechanics, it is 
significant from the methodological point of view that the con- 
cept of mass introduced at school remains the same. The concept 
of rest mass is “visualized” in the usual way from Newtonian me- 
chanics; it is the same mass that enters the relativistic relation- 
ships of dynamics, though it can no longer be defined as the ratio 
of force to acceleration, but it can be defined according to (5.68). 
The rest mass is simply the mass used in Newtonian mechanics. 

Indeed, at $\beta \ll 1$ equation (5.37b) becomes (5.37a), insofar as
in this case $\gamma \approx 1$. And the determination of mass in Newtonian
mechanics presents no difficulties. It is more important to stress 
that in relativistic mechanics the properties of space-time come 
into play and the laws of mechanics change, but we retain the in- 
variance of the rest mass as a characteristic of a particle. 

Sometimes attempts are made to link the increase in the energy 
of a particle (or a system) with an increase in mass. This is also 
a redundant interpretation. The dependence (5.46) shows that all 
forms of energy increase equally when a particle (system) is examined
not in its proper frame of reference.

The transformational properties of “relativistic mass” are also 
highly unsatisfactory. “Relativistic mass”, which is proportional 
to the energy of a particle, should transform as the fourth compo- 
nent of the energy-momentum 4-vector. As distinct from it, the 
rest mass is, as mentioned before, an invariant which, like charge,
characterizes an elementary particle. It is sometimes pointed out 
that the rest mass can change (see § 5.6), whereas the “relati- 
vistic mass” is always conserved, as long as the energy conserva- 
tion law holds. But at the same time the conservation of relativistic 
mass yields absolutely nothing in comparison with the law of con- 
servation of energy: it is simply a corollary of that law. Thus, the 
law of “conservation” of relativistic mass is a redundant equation. 

The introduction of the relativistic mass of particles and the 
law of its “conservation” leads to the introduction of a “photon 
mass”, $h\nu/c^2$. In § 7.6 we specifically dealt with the inexpediency
of employing this quantity. 

As is known, one can take as the primary principles of the spe- 
cial theory of relativity, not Einstein’s two postulates but, for 
example, his first postulate and the mass-velocity dependence [32]. 
Formally, special relativity can be developed on such a basis. But 
this, of course, does not add clarity to the physical meaning of re- 
lativistic mass. It is worth emphasizing that Einstein’s postulates 
possess a clear advantage over other possible postulates since they 
permit a direct physical interpretation and explicitly stress rela- 
tivistic features in determining the coordinates of an event. 

Sometimes it is pointed out that many eminent physicists in- 
troduced the relativistic mass. This is hardly a potent argument, 
the truth being that most leading physicists were against it [9, 11,
34, 35]. It is of interest that after lengthy discussions of relativistic
mass Richard Feynman wrote that, strange as it may seem, the
equation $m = m_0/\sqrt{1 - v^2/c^2}$ is rarely employed in practice. Instead,
there are two irreplaceable relationships which are easily
proved: $E^2 - P^2c^2 = M_0^2c^4$ and $Pc = Ev/c$.

Summing up, we can say that the invariant rest mass possesses 
indubitable advantages, whereas relativistic mass is a source of 
numerous misunderstandings, while adding nothing of substance. 

V. Non-inertial frames of reference. The special theory of re- 
lativity and the advance to gravitational theory (the general 
theory of relativity). The laws of dynamics help us single out 
inertial systems among all other possible frames of reference. We 
have defined inertial systems as those in which all three of New- 
ton’s classical laws hold. Newton's third law explicitly stresses 
that force is a consequence of interaction of bodies. In non-inertial 
systems it is no longer possible to preserve all three laws. If the
second law is retained we must introduce forces which do not 
satisfy the third law: inertia forces. Let us examine two examples 
to recall how this is done. 

In the STR a reference frame is a rigid body; the most general
motion of a rigid body is a combination of translational motion
and rotation. Arbitrary motion of a non-inertial frame of reference
relative to an inertial system is a composition of accelerated translational
motion and rotation (uniform or accelerated). Uniform
translational motion leaves us within the confines of inertial systems.
Our examples refer to accelerated translational motion and
uniform rotation.

Fig. S.7. (a) A buggy with a suspended bead rolls down an inclined plane with
constant acceleration a. In steady motion the thread of the pendulum is deflected
somewhat from the normal to the inclined plane. The force of gravity and tension
of the thread combine to yield a resultant imparting to the bead the acceleration,
a, needed for its motion together with the buggy. This is the reasoning
of an observer in the inertial system K connected with the “stationary” inclined
plane. In this system the law of Newtonian dynamics holds: acceleration is
caused only by forces. (b) The same buggy and bead are examined from the
point of view of system K’ connected with the buggy, which is moving with
acceleration. This is a non-inertial system and the inertia force $-ma$ must be
introduced. The bead is at rest relative to the buggy, hence the resultant of the
three forces acting on it (gravity, the tension of the thread, and the inertia
force) must vanish. It will readily be observed that the angle of deflection of
the thread from the normal to the plane of the buggy is the same as in case
(a), as it should be.

Example 1. A buggy of mass M is rolling without friction down 
an inclined plane. Suspended from a bracket on the buggy is a 
heavy bead of mass m. Determine the angle between the thread 
supporting the bead and the normal to the inclined plane
(Fig. S.7) for the case of steady motion. 

(a) Reasoning from the point of view of the inertial coordinate 
system K (connected with the inclined plane). The buggy is moving
with uniform acceleration $a = g \sin\alpha$ directed parallel to the
inclined plane. If the bead is at rest relative to the buggy it must
be subject to the same acceleration. But acceleration is due to 
force, and acting on the bead are only two forces: gravity and the 
tension of the thread. For them to yield a resultant parallel to the 
inclined plane they must be at an angle. Knowing the direction of
the acceleration and one of the forces (the force of gravity), it is simple
to construct graphically the direction and magnitude of the second
force (the tension of the thread).

(b) Reasoning from the point of view of the non-inertial coordinate
system K’ (connected with the buggy and the bead). In
this system the bead is simply at rest, hence the resultant of all
the forces acting on it is zero. But in addition to the force of
gravity and the tension of the thread we must take into account
the inertia force $-ma$. It will be readily observed that we arrive
at the same result.

Fig. S.8. (a) A suspended bead rotates together with the turntable of a centrifugal
machine. In steady motion the thread of the pendulum is deflected somewhat
from the vertical away from the axis of rotation. Compounded, the tension
of the thread and the force of gravity yield a resultant which serves as the
centripetal force, equal to $m\omega^2 r$ and directed toward the rotation axis,
needed to make the bead rotate together with the turntable. This is the reasoning
of an observer in an inertial system K located outside the turntable. In
this system Newton’s second law holds, and the centripetal acceleration is produced
by the compounding of two “real” forces. (b) An observer on the turntable,
that is, in the non-inertial system K’, will describe the same phenomenon
differently. In the K’ system the bead is at rest, hence the sum of all acting
forces must be zero. But now there are three forces: the tension of the thread,
the force of gravity, and the inertia force, equal to $-m\omega^2 r$. Compounded, they
yield the resultant force equal to zero. From the reasoning (b) it follows that
angle $\alpha$ has the same value as in (a).

Example 2. A bead of mass m suspended on a thread is placed 
on a centrifugal machine rotating at constant angular velocity $\omega$
(Fig. S.8). The bead is at a distance r from the rotation axis. De- 
termine the angle of deflection of the thread from the vertical. 

(a) Reasoning from the point of view of the inertial coordinate
system K connected with the “stationary” stand of the centrifugal
machine. For the bead to move together with the turntable it must
experience a centripetal acceleration $\omega^2 r$. For this the thread
must deviate from the vertical; then the resultant of the force of
gravity and the tension of the thread for a given deflection from 
the vertical can make for the required centripetal acceleration. 

(b) Reasoning from the point of view of the non-inertial coor- 
dinate system K’ connected with the turntable. In this system the 
bead is at rest, hence the resultant of all the forces acting on it is 
zero. In addition to the force of gravity and the tension of the 
thread we must introduce the centrifugal inertia force $-m\omega^2 r$. The
angle of deflection of the thread from the vertical, $\alpha$, is, of course,
in both cases the same. 
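
Both lines of reasoning in Example 2 can be put into numbers. In the sketch below the inertial-frame argument gives $\tan\alpha = \omega^2 r/g$, and the rotating-frame argument is checked by verifying that tension, gravity and the centrifugal inertia force sum to zero at that angle. The function names and the sample values of ω, r, m and g are assumptions made for illustration.

```python
import math

def deflection_angle(omega, r, g=9.81):
    """Frame K: T sin(alpha) = m omega^2 r and T cos(alpha) = m g,
    hence tan(alpha) = omega^2 r / g (the mass cancels)."""
    return math.atan2(omega ** 2 * r, g)

def net_force_in_rotating_frame(m, omega, r, alpha, g=9.81):
    """Frame K': sum of tension, gravity and the centrifugal inertia force.
    Returns (outward horizontal component, upward vertical component)."""
    tension = m * g / math.cos(alpha)              # fixed by vertical balance
    outward = m * omega ** 2 * r - tension * math.sin(alpha)
    upward = tension * math.cos(alpha) - m * g
    return outward, upward

alpha = deflection_angle(omega=2.0, r=0.5)         # hypothetical machine
print(math.degrees(alpha))                          # deflection in degrees
print(net_force_in_rotating_frame(0.1, 2.0, 0.5, alpha))  # both components near zero
```

The mass of the bead drops out of the angle, which reflects the proportionality of the inertia force to the inert mass.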

These two examples show how inertia forces can be used to pre- 
serve Newton’s second law in non-inertial coordinate systems. 

These examples do not include certain other types of “inertia 
forces”, but their essential features are apparent from them. Inertia 
forces are proportional to the “inert” mass of a body; they are 
either constant over all space (Example 1) or they increase in- 
finitely with infinite recession from the axis of rotation (Example 2). 

Galileo already knew that all bodies on Earth fall at the same 
rate, i.e. that the force of gravity imparts to them the same acceleration.
But inertia forces possess the same property. Thus, material
bodies react identically to inertia forces and gravity forces.
And another peculiarity of gravity forces is known from experi- 
ence: there is no shielding from them (it is, in principle, possible 
to get rid of all other forces). That is why no direct experimental 
verification of Newton's first law is possible on or near the Earth. 
Newton himself pointed out that to verify that law one would have 
to get to a place where there are no gravitational fields; that is 
why it was stressed in Chapter 1 that Newton’s first law is a 
postulate. 

Einstein's gravitational theory possibly originated when he got 
the idea of the equality of all frames of reference. It seems to con- 
tradict everything discussed in this book, which has repeatedly 
emphasized the special role of inertial frames of reference. But let 
us not be in too great a hurry. 

If it is impossible to get rid of either gravity or inertia, we can 
try to regard inertia and gravity as different aspects of the same 
phenomenon. Then Newton’s first law must be formulated diffe- 
rently. The first part of Newton’s statement of the law remains the 
same: free motion of a body is motion with no forces acting on the 
body, gravitation being excluded from the category of “force”. 
Formerly, according to Newton, free motion meant uniform motion 
in a straight line. Now, according to Einstein, free motion is motion
by inertia together with the action of gravity. Gravitation is no longer
a force. Now the action of forces is considered only when a body’s 
motion is deflected from free motion, which in the Newtonian 
scheme was called free fall. According to Einstein, inertia and 
gravity together condition “free” motion, they constitute its
“background”.

Of course, free motion in the Einsteinian sense is by no means 
in a straight line. In Euclidean geometry (on which Newtonian 
mechanics rests) a straight line is the shortest distance between 
two given points (or, as mathematicians call it, a geodesic).
We shall have to recall this a little later. 

Let us get back to the conclusions obtained at the beginning of 
this Supplement. Passing to non-inertial reference frames simu- 
lated the appearance of inertia forces proportional to the inert 
mass of the body. If we recall that the gravitational mass and 
inert mass are equal (or proportional), it becomes apparent from 
the first example that passing to a reference frame in translational 
motion, but with acceleration, simulates the appearance of a uniform
gravitational field of magnitude $-ma$. From the second
example it can be seen that going over to a uniformly rotating 
frame of reference also leads to the appearance of a field of force 
proportional to the body’s mass. In the general case, it can be 
asserted that passing to non-inertial frames of reference simulates 
a gravitational field. These fields have a peculiarity that distin- 
guishes them from “genuine” gravitational fields: they do not 
vanish at infinity, but they do vanish in passing over to inertial 
frames of reference. 

We have seen that a transition from one inertial frame of re- 
ference to another does not affect the square of the interval between 


events

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2. \qquad \text{(S.V.1)}$$


In passing from an inertial to a non-inertial reference frame, 
ds? changes its general form. Indeed, let us consider two examples 
of passing from an inertial to a non-inertial frame. 

Example 1. A coordinate system K’ is moving in a straight
line relative to K with uniform acceleration a:

$$x = x' + \frac{at'^2}{2}, \qquad dx = dx' + at'\,dt';$$
$$y = y', \qquad dy = dy';$$
$$z = z', \qquad dz = dz';$$
$$t = t', \qquad dt = dt'.$$

If K is an inertial system and

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$$

then in system K’

$$ds^2 = c^2\,dt'^2 - (dx' + at'\,dt')^2 - dy'^2 - dz'^2,$$

or

$$ds^2 = (c^2 - a^2t'^2)\,dt'^2 - 2at'\,dx'\,dt' - dx'^2 - dy'^2 - dz'^2.$$

Example 2. A uniformly rotating coordinate system (the angular
velocity of rotation is $\Omega$). From equations (A.1.10) of
Appendix I we get ($\theta = \Omega t$, with $t = t'$)

$$x = x'\cos\Omega t - y'\sin\Omega t,$$
$$y = x'\sin\Omega t + y'\cos\Omega t.$$

It is not hard to find that $ds^2$ transforms to the form

$$ds^2 = [c^2 - \Omega^2(x'^2 + y'^2)]\,dt'^2 + 2\Omega y'\,dx'\,dt' - 2\Omega x'\,dy'\,dt' - dx'^2 - dy'^2 - dz'^2.$$

It can be shown that in either case no time transformation can 
reduce ds? to the algebraic sum of the squares of the differentials 
of the four coordinates. 
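
The quadratic form of Example 2 can be cross-checked numerically. The Python sketch below compares the interval evaluated in the inertial frame K (by finite differences) with the expression written in the rotating frame K′. The function names and the sample numbers are illustrative assumptions and are not taken from the book; the displacements are chosen small so that finite differences approximate the differentials.

```python
import math

def to_inertial(omega, t, xp, yp):
    """Coordinates (x, y) in K of a point with coordinates (xp, yp) in the rotating frame K'."""
    x = xp * math.cos(omega * t) - yp * math.sin(omega * t)
    y = xp * math.sin(omega * t) + yp * math.cos(omega * t)
    return x, y

def ds2_inertial(c, omega, t, xp, yp, dt, dxp, dyp, dzp):
    """Interval between two nearby events, evaluated in K by finite differences."""
    x0, y0 = to_inertial(omega, t, xp, yp)
    x1, y1 = to_inertial(omega, t + dt, xp + dxp, yp + dyp)
    dx, dy = x1 - x0, y1 - y0
    return c**2 * dt**2 - dx**2 - dy**2 - dzp**2

def ds2_rotating(c, omega, xp, yp, dt, dxp, dyp, dzp):
    """The same interval from the quadratic form of Example 2 (with t = t', z = z')."""
    return ((c**2 - omega**2 * (xp**2 + yp**2)) * dt**2
            + 2 * omega * yp * dxp * dt
            - 2 * omega * xp * dyp * dt
            - dxp**2 - dyp**2 - dzp**2)

# Illustrative values in units with c = 1.
args = dict(c=1.0, omega=0.5, xp=0.3, yp=-0.2,
            dt=1e-5, dxp=3e-6, dyp=-2e-6, dzp=1e-6)
lhs = ds2_inertial(t=2.0, **args)
rhs = ds2_rotating(**args)
print(lhs, rhs)
```

The two numbers agree up to terms of third order in the small displacements, which is what is expected when differentials are replaced by finite differences.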

Thus, in the general case, going over to non-inertial systems
changes the expression for the (invariant) interval between events,
and in such a way that it no longer reduces to the “Galilean” form
(S.V.1). Let the metric of 4-space be written down in the general
case as

$$ds^2 = g_{ik}\,dx^i\,dx^k, \tag{S.V.2}$$

where $g_{ik}$ depends upon all four coordinates and summation is
assumed over the indices i and k. The difference between the
4-space metric (S.V.2) and the Galilean metric (S.V.1) is, according
to Einstein’s ideas, due to gravity. Thus, the difference
between $g_{ik}$ and the Galilean values ($g_{00} = c^2$,
$g_{11} = g_{22} = g_{33} = -1$) reflects the existence of gravitational
fields. But gravitational fields are associated with matter. Thus, the
geometric properties (metric) of space-time are by no means invariable
and they depend upon the physical objects within it. Knowledge of the
space-time metric makes it possible to answer the fundamental questions
that usually interest physicists. The question arises: how can these
$g_{ik}$’s be found? Einstein was able to write a set of (non-linear)
partial differential equations which must be satisfied by the
ten values of $g_{ik}$. These values depend upon the distribution of
matter and electromagnetic radiation. Einstein’s equations have
so far been solved only for a few special cases.

Let us briefly sum up. Passing to non-inertial reference frames
causes the appearance of metric coefficients $g_{ik}$ differing from the
Galilean ones and simulates the appearance of a certain field of force
proportional to the mass. It can therefore be assumed that the
values of $g_{ik}$ reflect the existence of a field of force similar to a
gravitational field. But a similar assumption is made with respect
to “true” gravitational fields, which remain in going over to inertial
frames of reference. It is obvious from this that if, in an inertial
frame of reference, the square of the interval is determined
according to (S.V.1), it means that there are no gravitational
fields.


The difference between fields appearing in a transition to non-inertial
frames of reference and true fields is that the $g_{ik}$’s corresponding
to “true” fields cannot be reduced to the Galilean form
(S.V.1) by any time or coordinate transformations. From the geometrical
point of view, 4-space-time which includes gravitational
fields is no longer flat. It is warped. But here we should stop and 
refer the reader to special literature (see, for example, [31]). 

It is only necessary to note the following. In terrestrial conditions
we successfully apply special relativity, i.e. employ the interval
(S.V.1), along with the Newtonian theory of gravitation,
i.e. we consider gravity a force. Newton’s gravitational theory is 
explicitly non-relativistic, being a theory of action-at-a-distance. 
Nevertheless, the results are excellent (for example, in calculating 
the motions of celestial bodies). But Einstein’s theory predicts that 
that is precisely as it should be, in certain conditions, of course. 
It is in “weak” gravitational fields (and within the solar system 
all gravitational fields are weak; there is no need to cite exact 
criteria) that Einstein’s gravitational equations reduce to Newton’s 
equation of gravity (Poisson’s equation). As to the velocities of 
celestial bodies, they are always non-relativistic. 


MAIN EVENTS RELATED 
TO THE HISTORY OF THE STR

Doppler effect, 1842.

Foucault pendulum experiment, 1851.

Laboratory determination of the speed of light, Fizeau, 1849, Fou- 
cault, 1862. 

Determination of the speed of light propagation in moving water, 
Fizeau, 1851. 

Development of the theory of electromagnetic field, Maxwell, 
1856-1864. 

Michelson’s first experiment, 1881. 

Publication of E. Mach’s book Die Mechanik in ihrer Entwicklung, 
1883. 

Improved Michelson’s experiment, 1887. 

Discovery of radioactivity, Becquerel, 1896. 

Discovery of electron, J. J. Thomson, 1894-1896. 

Kaufmann’s study of the movement of relativistic particles in elec- 
tromagnetic field, 1902. 

Lorentz’s papers devoted to the electrodynamics of moving bodies, 
1892-1904. 

Poincaré’s papers on relativism, 1895-1905. 

Poincaré’s speech in St. Louis, 1904. 

Einstein’s paper “On the Electrodynamics of Moving Bodies”, Ann. 
d. Phys. 17, 891 (1905). 

Minkowski’s lecture on space and time, Phys. Zs. 10, 104 (1909). 


Main Contributors to the Development of Space 
and Time Science 


Copernicus (1473-1543)      Doppler (1803-1853)
Galileo (1564-1642)         Maxwell (1831-1879)
Descartes (1596-1650)       Poincaré (1854-1912)
Huygens (1629-1695)         Minkowski (1864-1909)
Newton (1643-1727)          Einstein (1879-1955)


APPENDIX I 


Here we present some mathematical data needed for reading
this book.

§ 1. The symmetric notation. The summation rules. When a
rectilinear orthogonal (Cartesian) system of coordinates is introduced
in three-dimensional space, the unit vectors oriented
along the x, y, z axes are denoted by i, j, k respectively. The position
of any point in space is determined by the radius vector
$\mathbf r = x\mathbf i + y\mathbf j + z\mathbf k$ whose components are the
coordinates of the point. All directions in space being equivalent, it is
expedient to introduce the symmetric notation and to write, for example,
$x_1, x_2, x_3$ instead of x, y, z, and $\mathbf m_1, \mathbf m_2, \mathbf m_3$
instead of i, j, k. Then the radius vector of a point will be written in
the following form:

$$\mathbf r = x_1\mathbf m_1 + x_2\mathbf m_2 + x_3\mathbf m_3 = \sum_{\alpha=1}^{3} x_\alpha\mathbf m_\alpha, \tag{A.I.1}$$

where Σ denotes the summation carried out in this case over α
running from 1 to 3. The summation sign can be omitted once it has
been stipulated, once and for all, that two identical Greek letter
indices appearing on one side of an equation imply summation
from 1 to 3.

The special theory of relativity employs a four-dimensional space
with four coordinates $x_1 = x$, $x_2 = y$, $x_3 = z$, $x_4 = ict$. The
radius vector and other vectors have in this case four components.
The summation rule is valid, but the summation is carried out for
Latin letter indices running over all values from 1 to 4. For
example,

$$A_iB_i = \sum_{i=1}^{4} A_iB_i = A_1B_1 + A_2B_2 + A_3B_3 + A_4B_4.$$

So, the concise summation rule prescribes summation over two
identical indices appearing on one side of an equation; in the case
of Latin letter indices the summation runs over the values from
1 to 4.

Let us go back to the three-dimensional case. The radius vector
(A.I.1) can be written in an abbreviated form as
$\mathbf r = x_\alpha\mathbf m_\alpha$ and arbitrary vectors a and b as

$$\mathbf a = a_\alpha\mathbf m_\alpha = a_\beta\mathbf m_\beta = a_\gamma\mathbf m_\gamma, \quad \mathbf b = b_\alpha\mathbf m_\alpha = b_\beta\mathbf m_\beta = b_\gamma\mathbf m_\gamma.$$

The same equality is written out several times in order to show
that the summation indices are “mute”, i.e. the summation can be
carried out over any letter index without changing the result.

To illustrate the application of the abbreviated notation let us
derive the formula for the scalar product of two vectors a and b.
On the one hand

$$\mathbf{ab} = a_\alpha\mathbf m_\alpha b_\beta\mathbf m_\beta = a_\alpha b_\beta\mathbf m_\alpha\mathbf m_\beta. \tag{A.I.2}$$

Here we take into account that the summation rule pertains to two
identical indices; there are two summations to be carried out in
Eq. (A.I.2), each of them being performed over its own letter
index. On the other hand, unit vectors are mutually orthogonal;
therefore, each vector yields unity when multiplied by itself and
zero when multiplied by any other vector. Consequently,

$$\mathbf m_\alpha\mathbf m_\beta = \begin{cases} 1, & \alpha = \beta, \\ 0, & \alpha \neq \beta. \end{cases} \tag{A.I.3}$$

It is convenient to introduce the Kronecker delta possessing exactly
these properties:

$$\delta_{\alpha\beta} = \begin{cases} 1, & \alpha = \beta, \\ 0, & \alpha \neq \beta. \end{cases} \tag{A.I.4}$$

This symbol differs from zero only at α = β, and any summation
involving this symbol yields the simple result:
$a_\alpha\delta_{\alpha\beta} = a_\beta$. Indeed,

$$a_\alpha\delta_{1\alpha} = a_1\delta_{11} + a_2\delta_{12} + a_3\delta_{13} = a_1.$$

Now we can easily complete the derivation of Eq. (A.I.2),

$$\mathbf{ab} = a_\alpha b_\beta\mathbf m_\alpha\mathbf m_\beta = a_\alpha b_\beta\delta_{\alpha\beta} = a_\alpha b_\alpha,$$


to obtain the conventional formula for a scalar product of vec- 
tors. 
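
The role of the Kronecker delta in this derivation can be illustrated with a few lines of Python. The sketch below evaluates the double sum $a_\alpha b_\beta\delta_{\alpha\beta}$ and the contracted form $a_\alpha b_\alpha$ separately; the function names and the sample vectors are assumptions introduced for illustration, and Python indices run over 0, 1, 2 instead of 1, 2, 3.

```python
def delta(a, b):
    """Kronecker delta: 1 for equal indices, 0 otherwise."""
    return 1 if a == b else 0

def dot_double_sum(a, b):
    """a_alpha b_beta delta_{alpha beta}, summed over both repeated indices."""
    n = len(a)
    return sum(a[i] * b[j] * delta(i, j) for i in range(n) for j in range(n))

def dot_single_sum(a, b):
    """a_alpha b_alpha, the form left after the delta removes one summation."""
    return sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [4.0, -5.0, 6.0]
print(dot_double_sum(a, b), dot_single_sum(a, b))
```

Both expressions give the same number because the delta retains only the terms with equal indices.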

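The identical indices over which the summation is carried out
may appear in the numerator and denominator of a fraction. The
summation rule is valid in this case as well. Let us write, for
example, the expressions for the gradient of a function f and the
divergence of a vector a:

$$\operatorname{grad} f = \frac{\partial f}{\partial x_\alpha}\mathbf m_\alpha = \frac{\partial f}{\partial x_1}\mathbf m_1 + \frac{\partial f}{\partial x_2}\mathbf m_2 + \frac{\partial f}{\partial x_3}\mathbf m_3 = \frac{\partial f}{\partial x}\mathbf i + \frac{\partial f}{\partial y}\mathbf j + \frac{\partial f}{\partial z}\mathbf k,$$

$$\operatorname{div}\mathbf a = \frac{\partial a_\alpha}{\partial x_\alpha} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}.$$

When a Greek (or Latin) letter index appears alone, it is meant
to be “free”, taking on any value out of the three (or four) possible
ones. For example, $b_\alpha$ denotes one of the components of the
vector b, that is $b_1$, $b_2$ or $b_3$.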

§ 2. The transformation of coordinates in the case of a rotation
of the Cartesian system of coordinates. Let the radius vector of
the point M be expressed as $\mathbf r = x_\alpha\mathbf m_\alpha$ in the
“old” coordinate system. After the rotation of the coordinate system the
radius vector of the same point M will be written in the “new” coordinate
system as $\mathbf r = x'_\beta\mathbf m'_\beta$, where $x'_\beta$ are the
coordinates of the point in the system after the rotation and
$\mathbf m'_\beta$ are the new unit vectors. It is not difficult to find
the relationship between the coordinates of the old and new systems. Let
us write down the equality expressing the “conservation” of the vector r,

$$x_\alpha\mathbf m_\alpha = x'_\beta\mathbf m'_\beta,$$

and multiply both its sides by $\mathbf m'_\gamma$, i.e. an arbitrary unit
vector of the new coordinate system. The left-hand side yields

$$x_\alpha\mathbf m_\alpha\mathbf m'_\gamma = x_\alpha a_{\alpha\gamma};$$

here we used the designation
$\mathbf m_\alpha\mathbf m'_\gamma = \cos(\mathbf m_\alpha, \mathbf m'_\gamma) = a_{\alpha\gamma}$;
thus, $a_{\alpha\gamma}$ represents the cosine of the angle between the vector
$\mathbf m_\alpha$ of the old system and the vector $\mathbf m'_\gamma$ of the
new one. On the right-hand side we obtain the following chain of equalities:

$$x'_\beta\mathbf m'_\beta\mathbf m'_\gamma = x'_\beta\delta_{\beta\gamma} = x'_\gamma.$$

Thus,

$$x'_\gamma = a_{\alpha\gamma}x_\alpha \quad (\gamma = 1, 2, 3). \tag{A.I.5}$$
The new coordinates are expressed via the old ones linearly,
the coefficients being the cosines of the angles between the old and
the new coordinate axes. We also have to find the coefficients of the
expansion of the old unit vectors in terms of the new ones. Let us
expand the old vector $\mathbf m_\alpha$ via the new ones:

$$\mathbf m_\alpha = \tilde a_{\alpha\gamma}\mathbf m'_\gamma, \tag{A.I.6}$$

where $\tilde a_{\alpha\gamma}$ are unknown coefficients. To find them, let
us multiply both sides of this equation by $\mathbf m'_\beta$. Similarly to
the foregoing formula,

$$a_{\alpha\beta} = \tilde a_{\alpha\gamma}\delta_{\gamma\beta} = \tilde a_{\alpha\beta}. \tag{A.I.7}$$

We have obtained the following result: the coefficients of the expansion
of the unit vector $\mathbf m_\alpha$ with respect to the new unit vectors are
the cosines $a_{\alpha\gamma}$.

The cosines of the angles between the old and new vectors can
be combined in the matrix

$$a = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}; \tag{A.I.8}$$

in the designation $a_{\alpha\beta}$ the first index, α, indicates the row, and
the second, β, the column of the matrix (A.I.8). Thus, the transformation
of coordinates is determined by nine coefficients $a_{\alpha\beta}$. It is
known, however, that the position of any solid body (in our case
the coordinate system) with one fixed point can be defined
by three parameters (three Eulerian angles). Whence it is clear
that there are only three independent coefficients among the nine
coefficients of the matrix $a_{\alpha\beta}$. It is easy to find the requisite
relations between the coefficients $a_{\alpha\beta}$. Indeed, the rotation of the
coordinate system does not affect the distance from the origin to any
point: $r^2 = x_\alpha^2 = x'^2_\alpha$. But $x'_\alpha = a_{\beta\alpha}x_\beta$.
To square this expression, we have to multiply two sums whose summation
indices should be different:

$$x'_\alpha x'_\alpha = a_{\beta\alpha}x_\beta\,a_{\gamma\alpha}x_\gamma = a_{\beta\alpha}a_{\gamma\alpha}x_\beta x_\gamma.$$

But, on the other hand, this expression is equal to $x_\alpha^2$. This is
possible only if

$$a_{\beta\alpha}a_{\gamma\alpha} = \delta_{\beta\gamma} \quad (\beta, \gamma = 1, 2, 3). \tag{A.I.9}$$

Although at first glance there are nine conditions here, these
equations do not change when their indices β and γ are interchanged.
Consequently, there are only six independent equations
here. Each of them represents a product of the βth and γth rows
of the matrix (A.I.8). (The matrix row multiplication consists in
the summation of pairwise products of the respective elements.)
Eq. (A.I.9) means that the product of any row by itself is equal
to unity and by any other row to zero. Since the order of the row
multiplication makes no difference, e.g. the product of the first row
by the second row is equal to the product of the second row by the
first one, the number of independent equations is equal to six and
not to nine, as indicated above.

The relations obtained are best illustrated by the example of the
rotation of axes in the coordinate plane $(x_1, x_2)$. In this case

$$x'_1 = a_{11}x_1 + a_{21}x_2, \quad x'_2 = a_{12}x_1 + a_{22}x_2.$$

In accordance with Fig. A.1,

$$a_{11} = \cos\theta, \quad a_{21} = \sin\theta, \quad a_{12} = -\sin\theta, \quad a_{22} = \cos\theta,$$

and therefore,

$$x'_1 = x_1\cos\theta + x_2\sin\theta, \quad x'_2 = -x_1\sin\theta + x_2\cos\theta. \tag{A.I.10}$$

Passing to the conventional notation $x_1 = x$, $x_2 = y$, we obtain
the well-known equations of analytical geometry:

$$x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta.$$

We used these equations in the derivation of the Lorentz transformation.
We have obtained the equations of the direct transition
(from the unprimed to primed system).

Fig. A.1. The illustration of the general formulae for the coordinate
transformation by means of a rotation of the Cartesian system in a plane.
Such a rotation is defined by one parameter θ. The angles between the old
and new coordinate vectors are seen in the figure.
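
The plane rotation lends itself to a short numerical illustration. The Python sketch below builds the direction cosines $a_{\alpha\gamma}$ for a rotation through the angle θ, applies the direct transition (summation over the first index) and the reverse transition (summation over the second index), and forms the row products entering Eq. (A.I.9). The function names, the angle and the sample point are assumptions made for the example only.

```python
import math

def direction_cosines(theta):
    """a[alpha][gamma] = cosine of the angle between old m_alpha and new m'_gamma."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def to_new(a, x):
    """Direct transition x'_gamma = a_{alpha gamma} x_alpha (sum over the first index)."""
    n = len(x)
    return [sum(a[al][ga] * x[al] for al in range(n)) for ga in range(n)]

def to_old(a, xp):
    """Reverse transition x_alpha = a_{alpha beta} x'_beta (sum over the second index)."""
    n = len(xp)
    return [sum(a[al][be] * xp[be] for be in range(n)) for al in range(n)]

def row_products(a):
    """a_{beta alpha} a_{gamma alpha}; by (A.I.9) this should be the Kronecker delta."""
    n = len(a)
    return [[sum(a[b][al] * a[g][al] for al in range(n)) for g in range(n)]
            for b in range(n)]

theta = math.radians(30.0)
a = direction_cosines(theta)
x = [2.0, 1.0]
xp = to_new(a, x)
print(xp, to_old(a, xp), row_products(a))
```

Applying the direct and then the reverse transition returns the original coordinates, and the row products reproduce the unit matrix to rounding accuracy.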


The reverse transition equations are obtained in much the same
fashion. We shall write them out together with the direct transition
equations:

$$x'_\alpha = a_{\beta\alpha}x_\beta, \quad \mathbf m'_\alpha = a_{\beta\alpha}\mathbf m_\beta,$$
$$x_\alpha = a_{\alpha\beta}x'_\beta, \quad \mathbf m_\alpha = a_{\alpha\beta}\mathbf m'_\beta; \tag{A.I.11}$$

at the same time

$$a_{\alpha\beta} = \mathbf m_\alpha\mathbf m'_\beta = \cos(\mathbf m_\alpha, \mathbf m'_\beta).$$

Naturally, the reverse transition equations (see (A.I.11)) can be
obtained automatically from the direct transition equations by
exchanging primed and unprimed quantities and replacing the angle
θ by −θ (which corresponds to the rotation in the opposite direction).

How are the vector components transformed under the coordinate
transformation? This can easily be found by the same technique
that we used when deriving the equations for the transformation of
coordinates. We need not do this, however, once we notice that the
coordinates are also the components of a vector, namely of the
radius vector. Therefore, it is clear that the vector components are
transformed as the coordinates are, i.e.

$$b'_\alpha = a_{\beta\alpha}b_\beta, \quad b_\alpha = a_{\alpha\beta}b'_\beta. \tag{A.I.12}$$

As we have already mentioned, the four-dimensional (pseudo-Euclidean)
space used in the special theory of relativity formally includes
one imaginary coordinate associated with time:

$$x_1 = x, \quad x_2 = y, \quad x_3 = z, \quad x_4 = ict.$$

The Lorentz transformation corresponds to linear transformations
in this space:

$$x'_i = a_{ki}x_k, \quad x_i = a_{ik}x'_k, \tag{A.I.13}$$

and

$$a_{ik} = \begin{pmatrix} \Gamma & 0 & 0 & -i\beta\Gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ i\beta\Gamma & 0 & 0 & \Gamma \end{pmatrix}, \quad \Gamma = \frac{1}{\sqrt{1 - \beta^2}}, \quad \beta = \frac{V}{c}, \tag{A.I.14}$$

where V is the relative velocity of two reference frames.
The Lorentz transformation coefficients $a_{ik}$ satisfy the following
conditions:

$$a_{ki}a_{km} = \delta_{im}. \tag{A.I.15}$$

These equalities mean that the product of the Lorentz transformation
matrix columns yields unity when a column is multiplied by itself,
and zero when it is multiplied by any other column; the same is true
of the rows, $a_{ik}a_{mk} = \delta_{im}$.

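Let us calculate the determinant of the Lorentz matrix *

$$\begin{vmatrix} \Gamma & 0 & 0 & -i\beta\Gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ i\beta\Gamma & 0 & 0 & \Gamma \end{vmatrix} = \Gamma^2 - (-i\beta\Gamma)(i\beta\Gamma) = \Gamma^2(1 - \beta^2) = 1$$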

(the simplest way is to expand the determinant into the first row 
elements). The Lorentz matrix determinant proves to be equal to 
unity. This means that we deal with the proper Lorentz transfor- 
mation, i.e. we stay in the systems of clockwise triads of unit 
coordinate vectors and do not pass to the systems of anticlockwise 
triads. 
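
Both properties of the Lorentz matrix, the orthogonality of its rows and the unit determinant, are easy to verify numerically. In the Python sketch below the matrix (A.I.14) is written with complex entries, the products are formed without complex conjugation (as the formal use of $x_4 = ict$ requires), and the determinant is expanded along the first row. The function names and the value β = 0.6 are illustrative assumptions.

```python
import math

def lorentz_matrix(beta):
    """Coefficients a_ik of (A.I.14); beta = V/c, fourth coordinate x_4 = ict."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, -1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [1j * beta * g, 0, 0, g]]

def row_products(a):
    """a_ik a_mk summed over k, with no complex conjugation."""
    n = len(a)
    return [[sum(a[i][k] * a[m][k] for k in range(n)) for m in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

a = lorentz_matrix(0.6)  # beta = 0.6 is an arbitrary sample value
products = row_products(a)
print(det(a))
```

For any |β| < 1 the printed determinant equals unity to rounding accuracy, and `products` reproduces the Kronecker delta.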

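Let us write the direct and the inverse Lorentz transformation
for coordinates in full:

$$x'_1 = \Gamma(x_1 + i\beta x_4), \quad x'_2 = x_2, \quad x'_3 = x_3, \quad x'_4 = \Gamma(x_4 - i\beta x_1), \tag{A.I.16}$$
$$x_1 = \Gamma(x'_1 - i\beta x'_4), \quad x_2 = x'_2, \quad x_3 = x'_3, \quad x_4 = \Gamma(x'_4 + i\beta x'_1). \tag{A.I.17}$$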


* Some data on determinants can be found in § 6 of this Appendix. This
determinant is calculated there as well.

As to 4-vector components, they are transformed as coordinates,
and, consequently, we get for the vector $\vec A\,(A_1, A_2, A_3, A_4)$ (the
arrow over a letter denotes a 4-vector):

$$A'_i = a_{ki}A_k, \quad A_i = a_{ik}A'_k, \tag{A.I.18}$$

$$A'_1 = \Gamma(A_1 + i\beta A_4), \quad A'_2 = A_2, \quad A'_3 = A_3, \quad A'_4 = \Gamma(A_4 - i\beta A_1), \tag{A.I.19}$$

$$A_1 = \Gamma(A'_1 - i\beta A'_4), \quad A_2 = A'_2, \quad A_3 = A'_3, \quad A_4 = \Gamma(A'_4 + i\beta A'_1). \tag{A.I.20}$$

From Eq. (A.I.15) it follows that the scalar product of two
4-vectors is invariant under the Lorentz transformation. Indeed, if
$A'_i = a_{ki}A_k$, $B'_i = a_{mi}B_m$, then

$$\vec A\vec B = A'_iB'_i = a_{ki}A_k\,a_{mi}B_m = a_{ki}a_{mi}A_kB_m = \delta_{km}A_kB_m = A_kB_k.$$

The comparison of the second and the last links of this chain
of equalities proves the invariance of the scalar product $\vec A\vec B$.
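
The invariance of the scalar product can also be checked on numbers. The Python sketch below applies the direct transformation $A'_i = a_{ki}A_k$ to two sample 4-vectors whose fourth components are imaginary, and compares $A_iB_i$ before and after. The function names and all numerical values are assumptions chosen for illustration.

```python
import math

def lorentz_matrix(beta):
    """Coefficients a_ik of (A.I.14); beta = V/c, fourth coordinate x_4 = ict."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, -1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [1j * beta * g, 0, 0, g]]

def boost(a, A):
    """Direct transformation A'_i = a_ki A_k (summation over the first index)."""
    return [sum(a[k][i] * A[k] for k in range(4)) for i in range(4)]

def scalar_product(A, B):
    """A_i B_i with no complex conjugation; the fourth components are imaginary."""
    return sum(x * y for x, y in zip(A, B))

A = [1.0, 2.0, 3.0, 4.0j]    # sample components (A_1, A_2, A_3, A_4)
B = [0.5, -1.0, 2.0, 1.5j]
a = lorentz_matrix(0.6)       # Gamma = 1.25 for beta = 0.6
Ap, Bp = boost(a, A), boost(a, B)
print(scalar_product(A, B), scalar_product(Ap, Bp))
```

With these numbers the first component becomes $A'_1 = \Gamma(A_1 + i\beta A_4) = 1.25\,(1 - 2.4) = -1.75$, while the scalar product keeps its value.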

§ 3. The tensors. Vector quantities are a particular case of tensors,
mathematical quantities of a much more complicated nature.
Before passing over to them, let us point out the most essential
points in the definition of a vector. In a given coordinate system
a vector represents a directed line segment characterized by its
own coordinates. But since a coordinate system is chosen at will,
the vector coordinates are arbitrary. What is important, however, is
that using the vector coordinates determined in one Cartesian system,
we can find its Cartesian coordinates in any other system by
means of Eq. (A.I.5). It is these transformation equations
that define the vector. Thus, the vectorial nature of quantities is
revealed under the transformation of coordinates.

To clear up the concept of a tensor by means of a specific
example, we shall recall how the relationship between the electric
induction vector D and the external electric field strength E is
introduced in electrostatics. In general, the relation
$\mathbf D = \mathbf D(\mathbf E) = \mathbf D(E_1, E_2, E_3)$ is unknown. Let us
expand D into the components $D_\alpha$:

$$\mathbf D(\mathbf E) = D_\alpha(\mathbf E)\,\mathbf m_\alpha = D_\alpha(E_1, E_2, E_3)\,\mathbf m_\alpha.$$


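Assuming that in the absence of the external field (E = 0) the
vector D is also equal to zero (D(0) = 0) and assuming the external
field to be small as compared to the electric forces acting between
the molecules of a substance, we can expand the unknown vector
function D in a Taylor series:

$$D_1 = \left(\frac{\partial D_1}{\partial E_1}\right)_0 E_1 + \left(\frac{\partial D_1}{\partial E_2}\right)_0 E_2 + \left(\frac{\partial D_1}{\partial E_3}\right)_0 E_3 + \ldots = \left(\frac{\partial D_1}{\partial E_\beta}\right)_0 E_\beta + \ldots,$$

$$D_2 = \left(\frac{\partial D_2}{\partial E_\beta}\right)_0 E_\beta + \ldots, \quad D_3 = \left(\frac{\partial D_3}{\partial E_\beta}\right)_0 E_\beta + \ldots, \tag{A.I.21}$$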
where the summation is carried out over the index β. Due to the
smallness of the field E its components $E_1$, $E_2$ and $E_3$ are also
small (in fact, this is a good approximation in the case of real
fields, except for the fields generated in laser beams), and we can
take only the linear terms, neglecting all the others. Let us denote
the constant quantities, which are the derivatives at the zero point,
as follows:

$$\left(\frac{\partial D_\alpha}{\partial E_\beta}\right)_0 = \varepsilon_{\alpha\beta}.$$

Then the expressions obtained can be written in the form

$$D_1 = \varepsilon_{11}E_1 + \varepsilon_{12}E_2 + \varepsilon_{13}E_3,$$
$$D_2 = \varepsilon_{21}E_1 + \varepsilon_{22}E_2 + \varepsilon_{23}E_3, \tag{A.I.22}$$
$$D_3 = \varepsilon_{31}E_1 + \varepsilon_{32}E_2 + \varepsilon_{33}E_3$$

or in the abbreviated form

$$D_\alpha = \varepsilon_{\alpha\beta}E_\beta. \tag{A.I.23}$$

Using its components, we can easily compose the vector D:

$$\mathbf D = \varepsilon_{\alpha\beta}E_\beta\mathbf m_\alpha. \tag{A.I.24}$$


The relationship between two vectors expressed by Eqs. (A.I.23)
and (A.I.24) is referred to as a linear vector function; in other
words, the vector D is a linear vector function of E.

Using Eq. (A.I.24), we can construct the vector D from the
given vector E at each point of a dielectric in the coordinate system
where the coefficients $\varepsilon_{\alpha\beta}$ are known. The coordinate
system, however, is chosen arbitrarily. The rotation of the Cartesian
coordinate system changes the vector components without varying the
vectors themselves. The question is how the coefficients
$\varepsilon_{\alpha\beta}$ must change to maintain the relationship
$\mathbf D' = \varepsilon'_{\alpha\beta}E'_\beta\mathbf m'_\alpha$ in the new
system, with D = D′. This means that there must be two different
expansions of the same vector:

$$\mathbf D = \varepsilon_{\alpha\beta}E_\beta\mathbf m_\alpha = \varepsilon'_{\alpha\beta}E'_\beta\mathbf m'_\alpha. \tag{A.I.25}$$


The vector components and unit vectors are known to be transformed
according to Eq. (A.I.11): $E_\beta = a_{\beta\mu}E'_\mu$,
$\mathbf m_\alpha = a_{\alpha\gamma}\mathbf m'_\gamma$. In
accordance with these equations the left-hand side of Eq. (A.I.25)
can be rewritten in the following form (the right-hand side is left
unchanged):

$$\varepsilon_{\alpha\beta}a_{\beta\mu}a_{\alpha\gamma}E'_\mu\mathbf m'_\gamma = \varepsilon'_{\alpha\beta}E'_\beta\mathbf m'_\alpha.$$

Comparing the coefficients of $E'_\mu\mathbf m'_\gamma$ on the left-hand and
right-hand sides, we obtain the transformation law for the coefficients
$\varepsilon_{\alpha\beta}$:

$$\varepsilon'_{\gamma\mu} = a_{\alpha\gamma}a_{\beta\mu}\varepsilon_{\alpha\beta}. \tag{A.I.26}$$

The comparison of this equation with the coordinate transformation
law

$$x'_\gamma = a_{\alpha\gamma}x_\alpha \tag{A.I.27}$$

shows that each index of $\varepsilon_{\alpha\beta}$ is transformed according
to the law corresponding to the coordinate transformation law. Eq. (A.I.26)
represents the tensor transformation law. The inverse transformation
obviously takes the form
$\varepsilon_{\gamma\mu} = a_{\gamma\alpha}a_{\mu\beta}\varepsilon'_{\alpha\beta}$.

Here is the general definition of a tensor: if in a given Cartesian
coordinate system we have nine quantities $\varepsilon_{\alpha\beta}$ which,
under the coordinate transformation $x'_\alpha = a_{\beta\alpha}x_\beta$, are
transformed according to the formulae

$$\varepsilon'_{\alpha\beta} = a_{\gamma\alpha}a_{\mu\beta}\varepsilon_{\gamma\mu}, \tag{A.I.28}$$

these quantities form a tensor of the second rank. It is not difficult
to realize that vectors are transformed as tensors of the first rank.
The rank of a tensor (or, as it is sometimes called, the valency of
a tensor) is defined by the number of its indices. There are two
such indices in our case. Tensors of a higher rank are almost never
used in this book. A tensor is defined for a space possessing a
definite number of dimensions since its transformation law involves
the components of the transformation matrix. We have been
considering the three-dimensional space, and the Greek letter
indices α, β, γ, μ were varying from one to three.

We would like to point out two distinctive characteristics of a 
tensor transformation: 

(1) The transformation law for the coefficients of a linear vector 
function (tensor) is obtained as a condition for the invariant phys- 
ical relationship between vectors. 

(2) Any component of a tensor in the “new” coordinate system
is a linear combination of all components of this tensor in the “old”
system.

As a useful special case, let us note the transformation, under a
rotation in a plane, of a three-dimensional tensor of the second rank
$T_{\alpha\beta}$ whose only non-zero component is $T_{11}$. In the
primed system the following components will differ from zero (see
Eqs. (A.I.28) and (A.I.10)):

$$T'_{11} = a_{11}a_{11}T_{11} = a_{11}^2T_{11} = T_{11}\cos^2\theta,$$
$$T'_{22} = a_{12}a_{12}T_{11} = a_{12}^2T_{11} = T_{11}\sin^2\theta, \tag{A.I.29}$$
$$T'_{12} = T'_{21} = a_{11}a_{12}T_{11} = -T_{11}\sin\theta\cos\theta.$$

We shall come across these equations many times in the future.
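
The tensor transformation law can be tried out numerically. In the Python sketch below the coefficients $\varepsilon_{\alpha\beta}$ and the field E are arbitrary sample numbers (not data for any real dielectric), the rotation is taken about the third axis, and the function names are assumptions introduced for the example. The sketch checks that the vector D computed in the old system and then transformed coincides with the vector computed directly in the new system from the transformed coefficients, and it evaluates the special case of a tensor with the single component $T_{11}$.

```python
import math

def rotation_about_z(theta):
    """Direction cosines a[alpha][gamma] for a rotation through theta in the (x_1, x_2) plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    """v'_gamma = a_{alpha gamma} v_alpha."""
    return [sum(a[al][ga] * v[al] for al in range(3)) for ga in range(3)]

def transform_tensor(a, eps):
    """eps'_{gamma mu} = a_{alpha gamma} a_{beta mu} eps_{alpha beta}."""
    return [[sum(a[al][ga] * a[be][mu] * eps[al][be]
                 for al in range(3) for be in range(3))
             for mu in range(3)] for ga in range(3)]

def apply(eps, E):
    """D_alpha = eps_{alpha beta} E_beta."""
    return [sum(eps[al][be] * E[be] for be in range(3)) for al in range(3)]

eps = [[2.0, 0.3, 0.0], [0.3, 1.5, 0.1], [0.0, 0.1, 1.0]]  # illustrative values only
E = [1.0, -2.0, 0.5]
theta = math.radians(40.0)
a = rotation_about_z(theta)

D_transformed = transform_vector(a, apply(eps, E))                       # transform D itself
D_recomputed = apply(transform_tensor(a, eps), transform_vector(a, E))   # eps' applied to E'

T = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # only T_11 differs from zero
Tp = transform_tensor(a, T)
print(D_transformed, D_recomputed, Tp[0][0], Tp[0][1], Tp[1][1])
```

The agreement of the two ways of obtaining D′ is exactly the condition from which the transformation law was derived.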
The special theory of relativity deals with the four-dimensional
(pseudo-Euclidean) space. We have already considered the transformation
laws for 4-vectors in this space. (According to our definition,
a vector is a tensor of the first rank.) The transformation
rules for tensors in the 4-space remain essentially the same, but the
number of tensor components increases to sixteen while the summation
is carried out from 1 to 4:

$$A'_{ik} = a_{mi}a_{lk}A_{ml}, \quad A_{ik} = a_{im}a_{kl}A'_{ml}. \tag{A.I.30}$$


A tensor is referred to as symmetric if its components satisfy
the relation $A_{ik} = A_{ki}$. Such a tensor has only ten independent
components. The energy-momentum-tension tensor of an electromagnetic
field may serve as an example of a symmetric tensor.

A tensor is referred to as antisymmetric if its components satisfy
the relation $A_{ik} = -A_{ki}$. It is clear that the elements of this tensor
having identical indices (i = k) are equal to zero since the only
quantity which is equal to itself when taken with the opposite sign
is zero. Thus, an antisymmetric tensor has only six independent
components (in this connection it is sometimes called a six-vector).
The electromagnetic field tensor provides an example of
an antisymmetric tensor.

A tensor is referred to as the unit tensor if $A_{ik} = \delta_{ik}$. It is
easy to see that the tensor $\delta_{ik}$ retains its form under the Lorentz
transformation in all reference frames. Indeed, if $A_{ml} = \delta_{ml}$, then
according to Eq. (A.I.30)

$$A'_{ik} = a_{mi}a_{lk}A_{ml} = a_{mi}a_{lk}\delta_{ml} = a_{mi}a_{mk} = \delta_{ik};$$

the last transition is performed in accordance with Eq. (A.I.15).
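For reference we shall write the transformation formulae for
the tensor of the second rank $T_{ik}$ in the case of the Lorentz
transformation, i.e. the formulae describing the transition from the frame
K′ to the frame K (the inverse transition formulae are obtained
by changing the sign of V and by replacing the primed quantities
by unprimed ones and vice versa):

$$\begin{aligned}
T_{11} &= a_{11}a_{11}T'_{11} + a_{11}a_{14}(T'_{14} + T'_{41}) + a_{14}a_{14}T'_{44} = \Gamma^2\{T'_{11} - i\beta(T'_{14} + T'_{41}) - \beta^2T'_{44}\}, \\
T_{12} &= a_{11}T'_{12} + a_{14}T'_{42} = \Gamma(T'_{12} - i\beta T'_{42}), \\
T_{13} &= a_{11}T'_{13} + a_{14}T'_{43} = \Gamma(T'_{13} - i\beta T'_{43}), \\
T_{14} &= a_{11}a_{41}T'_{11} + a_{11}a_{44}T'_{14} + a_{14}a_{41}T'_{41} + a_{14}a_{44}T'_{44} = \Gamma^2\{T'_{14} + i\beta T'_{11} + \beta^2T'_{41} - i\beta T'_{44}\}, \\
T_{21} &= a_{11}T'_{21} + a_{14}T'_{24} = \Gamma(T'_{21} - i\beta T'_{24}), \\
T_{22} &= T'_{22}, \quad T_{23} = T'_{23}, \\
T_{24} &= a_{41}T'_{21} + a_{44}T'_{24} = \Gamma(T'_{24} + i\beta T'_{21}), \\
T_{31} &= a_{11}T'_{31} + a_{14}T'_{34} = \Gamma(T'_{31} - i\beta T'_{34}), \\
T_{32} &= T'_{32}, \quad T_{33} = T'_{33}, \\
T_{34} &= a_{41}T'_{31} + a_{44}T'_{34} = \Gamma(T'_{34} + i\beta T'_{31}), \\
T_{41} &= a_{41}a_{11}T'_{11} + a_{41}a_{14}T'_{14} + a_{44}a_{11}T'_{41} + a_{44}a_{14}T'_{44} = \Gamma^2\{T'_{41} + i\beta T'_{11} + \beta^2T'_{14} - i\beta T'_{44}\}, \\
T_{42} &= a_{41}T'_{12} + a_{44}T'_{42} = \Gamma(T'_{42} + i\beta T'_{12}), \\
T_{43} &= a_{41}T'_{13} + a_{44}T'_{43} = \Gamma(T'_{43} + i\beta T'_{13}), \\
T_{44} &= a_{41}a_{41}T'_{11} + a_{41}a_{44}(T'_{14} + T'_{41}) + a_{44}a_{44}T'_{44} = \Gamma^2\{T'_{44} + i\beta(T'_{14} + T'_{41}) - \beta^2T'_{11}\}.
\end{aligned} \tag{A.I.31}$$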
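
Component formulae of this kind are easy to get wrong in sign, so a numerical cross-check is useful. The Python sketch below applies the general law $T_{ik} = a_{im}a_{kl}T'_{ml}$ with the Lorentz matrix (A.I.14) and compares four of the resulting components with their explicit expressions through Γ and β. The function names, the value β = 0.6 and the components of T′ (simply $10m + l$) are illustrative assumptions with no physical meaning.

```python
import math

def lorentz_matrix(beta):
    """Coefficients a_ik of (A.I.14); beta = V/c, fourth coordinate x_4 = ict."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, -1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [1j * beta * g, 0, 0, g]]

def to_unprimed(a, Tp):
    """General law T_ik = a_im a_kl T'_ml for the transition K' -> K."""
    return [[sum(a[i][m] * a[k][l] * Tp[m][l] for m in range(4) for l in range(4))
             for k in range(4)] for i in range(4)]

beta = 0.6
g = 1.0 / math.sqrt(1.0 - beta**2)
# Arbitrary sample components: T'_ml = 10*m + l with m, l = 1..4.
Tp = [[10.0 * (m + 1) + (l + 1) for l in range(4)] for m in range(4)]
T = to_unprimed(lorentz_matrix(beta), Tp)

# Explicit expressions; Python index 0 corresponds to tensor index 1, and so on.
T11 = g**2 * (Tp[0][0] - 1j * beta * (Tp[0][3] + Tp[3][0]) - beta**2 * Tp[3][3])
T12 = g * (Tp[0][1] - 1j * beta * Tp[3][1])
T14 = g**2 * (Tp[0][3] + 1j * beta * Tp[0][0] + beta**2 * Tp[3][0] - 1j * beta * Tp[3][3])
T44 = g**2 * (Tp[3][3] + 1j * beta * (Tp[0][3] + Tp[3][0]) - beta**2 * Tp[0][0])
print(abs(T[0][0] - T11), abs(T[0][1] - T12), abs(T[0][3] - T14), abs(T[3][3] - T44))
```

All four printed differences vanish to rounding accuracy; the remaining components can be checked in the same way.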


Tensor quantities appear more often than it may seem at
first glance. We shall give some examples of tensors of the second
rank in the 4-space. The products of the components of two vectors
$\vec c\,(c_i)$ and $\vec b\,(b_k)$ form a tensor. Indeed, let us compose the
expression $A_{ik} = c_ib_k$. The transformation formulae for the vector
components are known:

$$c'_i = a_{mi}c_m, \quad b'_k = a_{lk}b_l. \tag{A.I.32}$$

Consequently,

$$A'_{ik} = c'_ib'_k = a_{mi}a_{lk}c_mb_l = a_{mi}a_{lk}A_{ml},$$

coinciding with the tensor transformation law (A.I.30).

Let us demonstrate that the derivative of a vector component
with respect to a coordinate is transformed as a tensor component.
Let us consider the vector $\vec b\,(b_i)$ and its component derivatives. It
will be easier for us to write out two formulae: $b'_i = a_{mi}b_m$,
$x_l = a_{lk}x'_k$, whence $\partial x_l/\partial x'_k = a_{lk}$, to be used in
the following chain of equalities:

$$\frac{\partial b'_i}{\partial x'_k} = \frac{\partial(a_{mi}b_m)}{\partial x_l}\frac{\partial x_l}{\partial x'_k} = a_{mi}a_{lk}\frac{\partial b_m}{\partial x_l}. \tag{A.I.33}$$

The first and the last links of Eq. (A.I.33) show that the derivative
$\partial b_i/\partial x_k$ is transformed according to the tensor component
transformation law.

From Eq. (A.I.33) it is also seen that the vector derivative can
be transformed in succession. First, we can pass from $b_m$ to $b'_i$
according to the formulae $b'_i = a_{mi}b_m$. Then from differentiating with
respect to $x_l$ we can proceed to differentiating with respect to $x'_k$.
Of course, the tensor character of the transformation is retained in
this process although in a rather concealed form. That is exactly
what is sometimes done when one tries to avoid tensors in
transformations of an electromagnetic field.

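§ 4. The invariance of a 4-divergence and d’Alembert’s operator.
Let us demonstrate the invariance of a four-dimensional divergence
and d’Alembert’s operator under the Lorentz transformation.
We shall write out the necessary formulae in a convenient form:

$$x'_i = a_{ki}x_k, \quad \frac{\partial x'_i}{\partial x_k} = a_{ki},$$

and for the components of the vector $\vec b$:

$$b'_i = a_{ki}b_k, \quad b_i = a_{ik}b'_k.$$

First, let us prove that a 4-gradient is transformed as a vector.
We shall consider the function $\Phi = \Phi(x_1, x_2, x_3, x_4)$ or, briefly,
$\Phi = \Phi(x_i)$. Let the coordinate transformation be defined by the
formulae $x'_i = x'_i(x_1, x_2, x_3, x_4)$. Then according to the
differentiation rule for composite functions

$$\frac{\partial\Phi}{\partial x_k} = \frac{\partial\Phi}{\partial x'_i}\frac{\partial x'_i}{\partial x_k} = a_{ki}\frac{\partial\Phi}{\partial x'_i}.$$

In this way we obtain the vector component transformation law
(A.I.32) again.

Now let us demonstrate the invariance of the 4-divergence. The
following chain of equalities establishes the fact (here
$\partial x_l/\partial x'_i = a_{li}$, since $x_l = a_{li}x'_i$):

$$\operatorname{div}\vec A' = \frac{\partial A'_i}{\partial x'_i} = \frac{\partial A'_i}{\partial x_l}\frac{\partial x_l}{\partial x'_i} = a_{li}\frac{\partial}{\partial x_l}(a_{ki}A_k) = a_{ki}a_{li}\frac{\partial A_k}{\partial x_l} = \delta_{kl}\frac{\partial A_k}{\partial x_l} = \frac{\partial A_k}{\partial x_k}.$$

Here we have taken into account that in accordance with Eq.
(A.I.15) $a_{ki}a_{li} = \delta_{kl}$.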
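D’Alembert’s operator applied to the function Φ,

$$\Box\Phi = \nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2},$$

will be written in the form

$$\Box\Phi = \frac{\partial^2\Phi}{\partial x_1^2} + \frac{\partial^2\Phi}{\partial x_2^2} + \frac{\partial^2\Phi}{\partial x_3^2} + \frac{\partial^2\Phi}{\partial(ict)^2} = \frac{\partial^2\Phi}{\partial x_i^2}.$$

The expression $\partial^2\Phi/\partial x_i^2$, however, is the divergence of a
gradient. Indeed, let $A_i = \partial\Phi/\partial x_i$, then

$$\operatorname{div}\vec A = \frac{\partial}{\partial x_i}\left(\frac{\partial\Phi}{\partial x_i}\right) = \frac{\partial^2\Phi}{\partial x_i^2}.$$

But the divergence of a 4-vector is an invariant of the Lorentz
transformation, and consequently

$$\frac{\partial^2\Phi}{\partial x_i^2} = \frac{\partial^2\Phi}{\partial x'^2_i}. \tag{A.I.34}$$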


Any component of a 4-vector $\vec\Phi\,(\Phi_i)$ can be taken as
the function Φ. Let us assume the respective components
of the two 4-vectors $\vec\Phi\,(\Phi_k)$ and $\vec S\,(S_k)$ to be related in
the frame K by the following equation:

$$\Box\Phi_k = \frac{\partial^2\Phi_k}{\partial x_i^2} = -\mu_0S_k. \tag{A.I.35}$$

Then it follows from Eq. (A.I.34) that the relation

$$\Box'\Phi_k = \frac{\partial^2\Phi_k}{\partial x'^2_i} = -\mu_0S_k \tag{A.I.36}$$

is valid provided Eq. (A.I.35) is. Multiplying the left-hand and
right-hand sides of Eq. (A.I.36) by the constant factor $a_{km}$ and
summing over k, we immediately get

$$\Box'\Phi'_m = -\mu_0S'_m, \tag{A.I.37}$$

since $a_{km}\Phi_k = \Phi'_m$ and $a_{km}S_k = S'_m$. The comparison of
Eqs. (A.I.35) and (A.I.37) shows that in the frame K′ we have exactly the
same equation as in K, with the unprimed quantities replaced by
primed ones.

§ 5. The convolution (“rejuvenation”) of tensor indices. In the
tensor calculus there is an operation that lowers the rank of a tensor.
In the case of a tensor of the second rank this
operation consists in summing up the tensor components possessing
two identical indices. It is noteworthy that such an operation gives
rise to an invariant expression. In the case of tensors of a higher
rank the convolution lowers the tensor’s rank by two.

We can demonstrate this property very easily. Let us write the
tensor component transformation formula

$$A'_{ik} = a_{mi}a_{lk}A_{ml}$$

and sum up the components $A'_{ik}$ possessing identical indices,
having put i = k. Then due to Eq. (A.I.15)

$$A'_{ii} = A'_{11} + A'_{22} + A'_{33} + A'_{44} = a_{mi}a_{li}A_{ml} = \delta_{ml}A_{ml} = A_{mm}.$$

Thus,

$$A_{11} + A_{22} + A_{33} + A_{44} = A'_{11} + A'_{22} + A'_{33} + A'_{44}. \tag{A.I.38}$$


Although tensors of higher ranks are hardly used in this book,
we shall sometimes deal with the results of their convolution. We
have seen that the differentiation of a scalar function, i.e. an
invariant expression, leads to the formation of the vector gradient
$\partial\Phi/\partial x_i$. The differentiation of a vector leads to the
formation of a tensor of the second rank,
$\partial^2\Phi/\partial x_i\partial x_k$. We have already seen that the
convolution of this expression results in the invariant (A.I.34):

$$\frac{\partial^2\Phi}{\partial x_i^2} = \frac{\partial^2\Phi}{\partial x'^2_i}.$$

If a tensor of the second rank has components depending on
coordinates, the differentiation of these components leads to the
formation of a tensor of the third rank. For example, from the
tensor $f_{ik}$ we get the tensor of the third rank
$\partial f_{ik}/\partial x_l$.

Let us perform the convolution of this tensor over the indices k
and l and see that it results in four quantities forming the components
of a vector. Indeed,

$$\frac{\partial f'_{ik}}{\partial x'_k} = a_{mi}a_{sk}\frac{\partial f_{ms}}{\partial x_p}\frac{\partial x_p}{\partial x'_k} = a_{mi}a_{sk}a_{pk}\frac{\partial f_{ms}}{\partial x_p} = a_{mi}\delta_{sp}\frac{\partial f_{ms}}{\partial x_p} = a_{mi}\frac{\partial f_{ms}}{\partial x_s}. \tag{A.I.39}$$

Since the indices k, m, s are mute, we see that the quantities
$\partial f_{ik}/\partial x_k$ are transformed according to the vector
transformation law (A.I.32).

Let us now consider how to obtain an invariant expression
from the components of tensors of the second rank. If the product
of components of two vectors forms, as we have seen, a tensor of
the second rank, it can readily be shown that the product of
components of two tensors of the second rank yields a tensor of the
fourth rank. Let the components of the tensor $F$ be denoted by $F_{ik}$
and those of the tensor $f$ by $f_{lm}$. Their product $T_{iklm} = F_{ik}f_{lm}$
is a tensor of the fourth rank. Let us perform the convolution of this
tensor over the indices $i$ and $l$ as well as over $k$ and $m$, i.e. let us
compose the expression

$$T_{ikik} = F_{ik}f_{ik}, \tag{A.I.40}$$

which represents the sum of the pairwise products of the respective
components. We shall make sure that this expression does not
change on transition from one reference frame to another. This
result is easy to prove since the transformation rule for the
components $F_{ik}$ and $f_{lm}$ is known:

$$F'_{ik}f'_{lm} = \alpha_{is}\alpha_{kp}\alpha_{lr}\alpha_{mt}F_{sp}f_{rt}.$$

Putting $i = l$, $k = m$, we shall get

$$F'_{ik}f'_{ik} = \alpha_{is}\alpha_{ir}\alpha_{kp}\alpha_{kt}F_{sp}f_{rt} = \delta_{sr}\delta_{pt}F_{sp}f_{rt} = F_{sp}f_{sp}. \tag{A.I.41}$$

The equality (A.I.41) provides the evidence of the invariance of
(A.I.40). Of course, the invariance of $F_{ik}^2$ or $f_{ik}^2$ represents a
special case of (A.I.40).
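The two invariance statements (A.I.38) and (A.I.41) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a three-dimensional rotation about the z axis as a stand-in for the orthogonal matrix $\alpha_{ik}$ (the text works with the four-dimensional Lorentz matrix), and the names `rotation_z`, `transform`, `trace` and `full_contraction`, as well as the sample components, are hypothetical.

```python
import math


def rotation_z(angle):
    """Orthogonal matrix alpha_ik for a rotation about the z axis (assumed example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(alpha, t):
    """Second-rank tensor law: t'_ik = alpha_im * alpha_kl * t_ml."""
    n = len(t)
    return [[sum(alpha[i][m] * alpha[k][l] * t[m][l]
                 for m in range(n) for l in range(n))
             for k in range(n)] for i in range(n)]


def trace(t):
    """Convolution of a second-rank tensor over its two indices: t_ii."""
    return sum(t[i][i] for i in range(len(t)))


def full_contraction(big_f, small_f):
    """Double convolution F_ik * f_ik of two second-rank tensors."""
    n = len(big_f)
    return sum(big_f[i][k] * small_f[i][k] for i in range(n) for k in range(n))


ALPHA = rotation_z(0.7)
F = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]     # arbitrary sample values
f = [[0.5, -1.0, 2.0], [3.0, 0.0, -2.0], [1.0, 4.0, 1.5]]    # arbitrary sample values
F_PRIME = transform(ALPHA, F)
f_PRIME = transform(ALPHA, f)

print(trace(F), trace(F_PRIME))
print(full_contraction(F, f), full_contraction(F_PRIME, f_PRIME))
```

Both printed pairs should agree up to rounding, which is all that the invariance of a fully convoluted expression asserts.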

Since we have repeatedly made use of the Gauss-Ostrogradsky
theorem to treat the vectors resulting from the three-dimensional
convolution of a tensor, we shall write out the requisite formulae.
In the three-dimensional space this theorem relates the flux of a
vector across the closed surface $S$ to the integral over the
volume $\mathcal{V}$ enveloped by that surface, e.g.,

$$\oint_S D_n\,dS = \int_{\mathcal{V}} \operatorname{div}\mathbf{D}\,d\mathcal{V}. \tag{A.I.42}$$

The same theorem takes the following form in the symmetric
notation:

$$\oint_S D_\alpha n_\alpha\,dS = \int_{\mathcal{V}} \frac{\partial D_\alpha}{\partial x_\alpha}\,d\mathcal{V}, \tag{A.I.43}$$

where $n_\alpha$ are the components of the normal to the surface
element $dS$. Applying Eq. (A.I.43) to the vector

$$A_\beta = \frac{\partial f_{\alpha\beta}}{\partial x_\alpha},$$

we shall obtain

$$\int_{\mathcal{V}} \frac{\partial f_{\alpha\beta}}{\partial x_\alpha}\,d\mathcal{V} = \oint_S f_{\alpha\beta}n_\alpha\,dS. \tag{A.I.44}$$


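§ 6. Some data on determinants. The dual tensors. 1. Let us
arrange $n^2$ elements, denoted by the symbol $a_{ik}$, where $i$, $k$ take
on all values from 1 to $n$, in the form of a square table. Let the
first index $i$ in the symbol $a_{ik}$ denote the number of the row and the
second index $k$ the number of the column of an element. Thus, we
shall obtain the square matrix formed by the elements $a_{ik}$:

$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}.$$

This matrix is brought into correspondence with the determinant

$$|a_{ik}| = \begin{vmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{vmatrix},$$

which implies a definite operation performed over the elements $a_{ik}$,
that is the formation of a sum of $n!$ terms composed of the elements
$a_{ik}$. This sum can be obtained as follows. Take the product of
elements involving the elements from different rows, e.g. the rows
$1\ 2\ \dots\ n$:

$$a_{1\alpha}a_{2\beta}\dots a_{n\tau} \tag{A.I.45}$$

or the product of elements involving the elements from different
columns, e.g., $1\ 2\ \dots\ n$:

$$a_{\alpha 1}a_{\beta 2}\dots a_{\tau n} \tag{A.I.46}$$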


where the values of the indices $\alpha, \beta, \dots, \tau$ will be now defined. To
obtain the value of the determinant, let us compose the algebraic
sum of the terms (A.I.45) or (A.I.46) differing from one another
by the indices $\alpha, \beta, \dots, \tau$, which form a certain permutation of the
natural sequence of numbers $1\ 2\ \dots\ n$ in each term of the sum. This
means that the indices $\alpha, \beta, \dots, \tau$ have different values in each
term of the sum. The sum is taken over all the permutations of the
numbers $1\ 2\ \dots\ n$, the total number of such permutations being
equal to $n!$.

The sign $+$ or $-$ is ascribed to each term of the sum
depending on whether an even or an odd number of pairwise
permutations (transpositions) of elements is needed to obtain a given
arrangement $\alpha\ \beta\ \dots\ \tau$ from the natural sequence of numbers
$1\ 2\ \dots\ n$. The pairwise transposition consists, for example, in the
transition from the sequence 1 2 3 4 to the sequence 1 3 2 4
involving the transposition of the digits 2 and 3. The number of
transpositions needed to accomplish the transition from the
natural sequence to a given one is denoted by the letter $r$. So,
according to the definition, a determinant of the $n$th order is written
in full as follows:

$$D_n = |a_{ik}| = \begin{vmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{vmatrix} = \sum_{\alpha\neq\beta\neq\dots\neq\tau} (-1)^r a_{\alpha 1}a_{\beta 2}\dots a_{\tau n} = \sum_{\alpha\neq\beta\neq\dots\neq\tau} (-1)^r a_{1\alpha}a_{2\beta}\dots a_{n\tau}, \tag{A.I.47}$$


where the sum is taken over all the permutations of the indices
$\alpha\ \beta\ \dots\ \tau$ taking on various values from 1 to $n$. The last two members
of the equality indicate one of the basic properties of a determinant,
that is the equivalence of rows and columns. Of course, in
expressions of the forms (A.I.45) and (A.I.46) it is not obligatory to
take respectively the first or second indices arranged in the natural
sequence. But then the bringing of specific elements to the
canonical form (A.I.45) or (A.I.46) would require the indices $\alpha, \beta, \dots, \tau$
to be re-denoted. This new notation would be reduced to the
transposition of the indices; in the determinant itself this would mean
the transposition of rows or columns. Hence it is clear that an odd
number of transpositions of rows (or columns) alters the
sign of the determinant while an even number of transpositions
of rows (or columns) leaves the value of the determinant
unchanged. Having fixed definite values of the indices $i, k, \dots, s$, we
can compose the sum of the terms with transposed indices $\alpha\ \beta\ \dots\ \tau$,
each term being accompanied with the requisite sign, i.e.

$$\sum_{\alpha\neq\beta\neq\dots\neq\tau} (-1)^r a_{\alpha i}a_{\beta k}\dots a_{\tau s},$$

whose value will be equal to $+D_n$ or $-D_n$ depending on what
number of transpositions, even or odd, is needed to obtain the
sequence $i\ k\ \dots\ s$ from the natural sequence $1\ 2\ \dots\ n$.

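This is how the determinant of the third order $D_3$ is written in
full:

$$D_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \sum_{\alpha\neq\beta\neq\gamma} (-1)^r a_{1\alpha}a_{2\beta}a_{3\gamma}$$

$$= a_{11}a_{22}a_{33} + a_{13}a_{21}a_{32} + a_{12}a_{23}a_{31} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}.$$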
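The definition (A.I.47) translates almost literally into code. The following is a minimal sketch, not an efficient algorithm (the sum has $n!$ terms); the helper names `transposition_count` and `det_by_permutations` and the sample matrix are assumptions made for illustration.

```python
from itertools import permutations


def transposition_count(perm):
    """Count pairwise transpositions r that bring perm back to the natural sequence.

    Only the parity of r matters for the sign (-1)**r.
    """
    perm = list(perm)
    r = 0
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            r += 1
    return r


def det_by_permutations(a):
    """Sum of (-1)**r * a[0][alpha] * a[1][beta] * ... over all permutations."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = 1
        for row, col in enumerate(perm):
            term *= a[row][col]
        total += (-1) ** transposition_count(perm) * term
    return total


A3 = [[2, 0, 1],
      [1, 3, 2],
      [1, 1, 4]]          # arbitrary sample matrix
A3_T = [list(col) for col in zip(*A3)]  # rows and columns interchanged

print(det_by_permutations(A3), det_by_permutations(A3_T))
```

The transposed matrix gives the same value, which illustrates the equivalence of rows and columns noted in the text.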


2. The determinant calculation technique. Let us pick from the
sum (A.I.47) all the terms containing a certain element $a_{ik}$, group
them together and take this element as a common factor for this
combination of terms. The coefficient of the element $a_{ik}$ thus
obtained will be denoted by $A_{ik}$ and referred to as an adjunct or
co-factor of the element $a_{ik}$. The co-factor of a given element is
calculated in accordance with a simple rule. In the determinant $D_n$
we cross out the row and the column which contain the element
$a_{ik}$ whose co-factor $A_{ik}$ is sought for. Having crossed out the $i$th
row and the $k$th column, we obtain the determinant $D_{n-1}$ of the
$(n-1)$th order which is referred to as a minor $\Delta_{ik}$ of the
element $a_{ik}$:

$$\Delta_{ik} = \begin{vmatrix} a_{11} & \dots & a_{1,k-1} & a_{1,k+1} & \dots & a_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{i-1,1} & \dots & a_{i-1,k-1} & a_{i-1,k+1} & \dots & a_{i-1,n} \\ a_{i+1,1} & \dots & a_{i+1,k-1} & a_{i+1,k+1} & \dots & a_{i+1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \dots & a_{n,k-1} & a_{n,k+1} & \dots & a_{nn} \end{vmatrix}.$$

The co-factor $A_{ik}$ may differ from the minor $\Delta_{ik}$ only in sign:

$$A_{ik} = (-1)^{i+k}\Delta_{ik}.$$


Each element has its own corresponding co-factor, but a given
element does not enter every term of the sum (A.I.47). We can,
however, select a definite number of elements of a determinant
which together with their co-factors permit the value of the
determinant to be found. In fact, there is a theorem stating that the
determinant can be expanded into the elements of any row or any
column as follows:

$$D_n = \sum_{\alpha=1}^{n} a_{\alpha k}A_{\alpha k} = \sum_{\beta=1}^{n} a_{k\beta}A_{k\beta};$$

the summation over $k$ is not carried out here while $k$ itself may
take on any value from 1 to $n$. If we compose the sum of the
products of the elements of any row (or any column) and the co-factors
of another row (or another column), it will be equal to zero:

$$\sum_{\alpha=1}^{n} a_{\alpha k}A_{\alpha l} = 0 \quad (k \neq l).$$

The last two formulae are combined into one:

$$\sum_{\alpha=1}^{n} a_{\alpha k}A_{\alpha l} = D_n\delta_{kl}.$$


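For example, let us calculate the determinant of the Lorentz
transformation matrix, having expanded it into the elements of the
first row:

$$D_4 = \begin{vmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} & \alpha_{24} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} & \alpha_{34} \\ \alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44} \end{vmatrix} = \begin{vmatrix} \Gamma & 0 & 0 & i\beta\Gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -i\beta\Gamma & 0 & 0 & \Gamma \end{vmatrix}$$

$$= \Gamma\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \Gamma \end{vmatrix} - i\beta\Gamma\begin{vmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -i\beta\Gamma & 0 & 0 \end{vmatrix} = \Gamma^2 - \beta^2\Gamma^2 = \Gamma^2(1 - \beta^2) = 1.$$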

The reader can easily ascertain that the multiplication of the 
elements of the first row by the co-factors of the elements of other 
rows yields zero. 
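The expansion into co-factors, and the exercise left to the reader, can be sketched as follows. This is an illustrative recursion only; the function names and the value $\beta = 0.6$ are assumptions, and the Lorentz matrix is taken in the form with the imaginary fourth coordinate shown above.

```python
import math


def minor(a, i, k):
    """Determinant left after crossing out row i and column k (0-based)."""
    sub = [row[:k] + row[k + 1:] for r, row in enumerate(a) if r != i]
    return det(sub)


def cofactor(a, i, k):
    """Co-factor A_ik = (-1)**(i + k) * minor; the parity is the same for 0-based indices."""
    return (-1) ** (i + k) * minor(a, i, k)


def det(a):
    """Expand the determinant into the elements of the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum(a[0][k] * cofactor(a, 0, k) for k in range(len(a)))


def lorentz_matrix(beta):
    """Lorentz matrix alpha_ik with imaginary fourth coordinate (assumed form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0, 0, 1j * beta * gamma],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-1j * beta * gamma, 0, 0, gamma]]


L = lorentz_matrix(0.6)
D4 = det(L)
# First-row elements multiplied by the co-factors of the second row.
MIXED = sum(L[0][k] * cofactor(L, 1, k) for k in range(4))

print(D4, MIXED)
```

Up to rounding, `D4` equals 1 and `MIXED` equals 0, in line with the two statements of the text.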

3. Let us introduce a fully antisymmetric unit tensor of the
$n$th rank. This is the tensor $\delta_{\alpha\beta\dots\tau}$ whose components alter sign
when any two indices are interchanged and all components
differing from zero are equal either to $+1$ or to $-1$. Any component of
an antisymmetric tensor $\delta_{\alpha\beta\dots\tau}$ whose two indices are equal turns
into zero (the transposition of two such indices alters the
component's sign due to the antisymmetry of the tensor, but at the same
time we get the same component; thus, only zero can be equal to
itself when taken with an opposite sign). Thus, only those
components of the tensor $\delta_{\alpha\beta\dots\tau}$ differ from zero whose indices are all
different. Suppose $\delta_{12\dots n} = 1$; then all the components differing
from zero are equal to $+1$ if the arrangement $\alpha\beta\dots\tau$ is obtained
from the sequence $1\ 2\ \dots\ n$ through an even number of
transpositions. If the number of such transpositions in the arrangement
$\alpha\beta\dots\tau$ is odd, the component $\delta_{\alpha\beta\dots\tau}$ is equal to $-1$. Making use
of a fully antisymmetric unit tensor, we can rewrite the
expression for the determinant $D_n$ as follows:

$$D_n = \delta_{\alpha\beta\dots\tau}\,a_{1\alpha}a_{2\beta}\dots a_{n\tau} = \delta_{\alpha\beta\dots\tau}\,a_{\alpha 1}a_{\beta 2}\dots a_{\tau n},$$

where the summation over the pairs of indices $\alpha\beta\dots\tau$ is implied
this time.

In particular, we can write the determinant corresponding to the
Lorentz matrix:

$$D_4 = \delta_{\alpha\beta\gamma\delta}\,\alpha_{1\alpha}\alpha_{2\beta}\alpha_{3\gamma}\alpha_{4\delta} = 1.$$


4. Now we shall deal with the 4-space of the STR. First of all,
it should be pointed out that we have defined a fully antisymmetric
unit tensor $\delta_{\alpha\beta\gamma\delta}$ without proving it to be a tensor. We must
make sure that the components of this tensor have the same values
in all IFRs, i.e. under the Lorentz transformation. This can be
done quite easily. In accordance with the tensor component
transformation rule

$$\delta'_{iklm} = \alpha_{i\alpha}\alpha_{k\beta}\alpha_{l\gamma}\alpha_{m\delta}\,\delta_{\alpha\beta\gamma\delta}.$$

But according to what was said in item 1 of this section the
quantity on the right-hand side is equal to $D_4\delta_{iklm}$, i.e. is equal to
$\pm 1$ depending on the number of transpositions needed to obtain
the arrangement $iklm$ from the natural one. This means that $\delta_{\alpha\beta\gamma\delta}$
has the same components in any IFR. The components of this
tensor also do not change on transition from the left system of
coordinates to the right one (i.e. when one or three spatial
coordinates change sign). According to Eq. (A.I.30) the components of
this tensor had to alter sign in this case. Consequently, $\delta_{\alpha\beta\gamma\delta}$ is not
a tensor but a pseudotensor; its components behave in a different
way, as compared to tensors, when a coordinate alters sign (is
reflected). In all other transformations the behaviour of these
components coincides with that of the components of a tensor.

5. The cross and the mixed product of vectors in a three-dimen- 
sional space. These questions are treated in order to have a good 
analogy when considering some of the quantities in the 4-space of 
the STR. 

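Consider three unit vectors $\mathbf{m}_1$, $\mathbf{m}_2$, $\mathbf{m}_3$ of the orthogonal
Cartesian system of coordinates. Compose the cross product of any
pair of these vectors $[\mathbf{m}_\alpha\mathbf{m}_\beta]$; as a result, the third vector will be
obtained with the "plus" or "minus" sign, depending on the order
of the factors in the cross product. This cross product can be easily
written via the fully antisymmetric unit tensor of the third rank:

$$[\mathbf{m}_\alpha\mathbf{m}_\beta] = \delta_{\alpha\beta\gamma}\mathbf{m}_\gamma.$$

Now we can write the cross product of the vectors $\mathbf{a}_1 = a_{1\alpha}\mathbf{m}_\alpha$
and $\mathbf{a}_2 = a_{2\beta}\mathbf{m}_\beta$:

$$[\mathbf{a}_1\mathbf{a}_2] = [a_{1\alpha}\mathbf{m}_\alpha,\ a_{2\beta}\mathbf{m}_\beta] = a_{1\alpha}a_{2\beta}[\mathbf{m}_\alpha\mathbf{m}_\beta] = \delta_{\alpha\beta\gamma}a_{1\alpha}a_{2\beta}\mathbf{m}_\gamma. \tag{A.I.48}$$

It is seen from Eq. (A.I.48) that $\mathbf{m}_\gamma$ has the coefficients formed
by the products of vector components and convoluted with the
tensor $\delta_{\alpha\beta\gamma}$. Let us rewrite Eq. (A.I.48):

$$[\mathbf{a}_1\mathbf{a}_2] = \delta_{\alpha\beta\gamma}a_{1\alpha}a_{2\beta}\mathbf{m}_\gamma = \frac{1}{2}\left(\delta_{\alpha\beta\gamma}a_{1\alpha}a_{2\beta} + \delta_{\beta\alpha\gamma}a_{1\beta}a_{2\alpha}\right)\mathbf{m}_\gamma = \frac{1}{2}\delta_{\alpha\beta\gamma}\left(a_{1\alpha}a_{2\beta} - a_{1\beta}a_{2\alpha}\right)\mathbf{m}_\gamma = \frac{1}{2}\delta_{\alpha\beta\gamma}C_{\alpha\beta}\mathbf{m}_\gamma.$$

In the third link of this equality a second term is added which
is equal to the first term with the mute indices $\alpha$ and $\beta$
interchanged. In the fourth link we take into account that
$\delta_{\beta\alpha\gamma} = -\delta_{\alpha\beta\gamma}$, and $\delta_{\alpha\beta\gamma}$ is taken out of the parentheses.
The antisymmetric tensor thus formed in the parentheses is
denoted by $C_{\alpha\beta} = a_{1\alpha}a_{2\beta} - a_{1\beta}a_{2\alpha}$. Consequently, the cross product
$[\mathbf{a}_1\mathbf{a}_2]$ is a vector whose components are obtained from the
antisymmetric tensor $C_{\alpha\beta}$ according to the formulae

$$C_\gamma = \frac{1}{2}\delta_{\alpha\beta\gamma}C_{\alpha\beta}.$$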

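The vector $\mathbf{C}(C_\gamma)$ is said to be dual to the antisymmetric tensor
$C_{\alpha\beta}$. This means that the vector $\mathbf{C}$ is orthogonal to the two vectors $\mathbf{a}_1$
and $\mathbf{a}_2$ defining a two-dimensional plane. The orthogonality can be
proved analytically at once:

$$\mathbf{C}\mathbf{a}_1 = C_\gamma a_{1\gamma} = \frac{1}{2}\delta_{\alpha\beta\gamma}C_{\alpha\beta}a_{1\gamma} = \frac{1}{2}\delta_{\alpha\beta\gamma}\left\{a_{1\alpha}a_{2\beta}a_{1\gamma} - a_{1\beta}a_{2\alpha}a_{1\gamma}\right\} = 0.$$

This expression is equal to zero since both the minuend and the
subtrahend are determinants with two equal rows and such
determinants are equal to zero. In much the same way it is proved
that $\mathbf{C}\mathbf{a}_2 = 0$. In geometrical terms the absolute value of the vector $\mathbf{C}$ is
equal to the area of a parallelogram constructed on the vectors $\mathbf{a}_1$
and $\mathbf{a}_2$.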
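The construction of the cross product through the antisymmetric unit tensor can be sketched in a few lines. This is an illustration under assumptions: the function names and the two sample vectors are hypothetical, and the sign of the symbol is found by counting inversions, which has the same parity as the number of transpositions used in the text.

```python
import math


def levi_civita(*idx):
    """Fully antisymmetric unit symbol: +1 / -1 for even / odd arrangements, 0 if indices repeat."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    for i in range(len(idx)):
        for j in range(i + 1, len(idx)):
            if idx[i] > idx[j]:
                sign = -sign  # each inversion flips the sign
    return sign


def cross(a1, a2):
    """Components C_gamma = delta_{alpha beta gamma} * a1_alpha * a2_beta."""
    return [sum(levi_civita(al, be, ga) * a1[al] * a2[be]
                for al in range(3) for be in range(3))
            for ga in range(3)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


A1 = [1.0, 2.0, 0.0]   # arbitrary sample vectors
A2 = [0.0, 1.0, 3.0]
C = cross(A1, A2)

# Area of the parallelogram computed independently of the cross product.
AREA = math.sqrt(dot(A1, A1) * dot(A2, A2) - dot(A1, A2) ** 2)

print(C, dot(C, A1), dot(C, A2), math.sqrt(dot(C, C)), AREA)
```

The two scalar products vanish and the length of `C` coincides with `AREA`, as stated above.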

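The mixed product of three vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ is denoted by
$(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3)$ and defined as follows:

$$(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3) = \mathbf{a}_1[\mathbf{a}_2\mathbf{a}_3] = a_{1\lambda}\mathbf{m}_\lambda\,\delta_{\alpha\beta\gamma}a_{2\alpha}a_{3\beta}\mathbf{m}_\gamma = \delta_{\alpha\beta\gamma}a_{1\lambda}a_{2\alpha}a_{3\beta}\,\mathbf{m}_\lambda\mathbf{m}_\gamma$$

$$= a_{1\lambda}\delta_{\lambda\gamma}a_{2\alpha}a_{3\beta}\delta_{\alpha\beta\gamma} = \delta_{\alpha\beta\gamma}a_{1\gamma}a_{2\alpha}a_{3\beta} = D_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$$

Here $\delta_{\lambda\gamma}$ is the Kronecker delta (A.I.4).

In geometrical terms the mixed product of three vectors defines
the volume of the parallelepiped constructed on these vectors. This
volume is obtained with the "+" or "-" sign depending on the
order of the vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ in the mixed product.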

6. The dual tensors. Let two 4-vectors $a_i$ and $b_k$ be given in the
4-space. Then the projections of the parallelogram area on the
coordinate planes $(x_i, x_k)$ are defined by the antisymmetric tensor
$f_{ik} = a_ib_k - a_kb_i$. In the 4-space each area element $f_{ik}$ can be
brought into correspondence with another normal area element $f^*_{ik}$
such that all the straight lines lying in it are normal to all the
straight lines of the initial area element. If the element $f^*_{ik}$
orthogonal to $f_{ik}$ has the same area as $f_{ik}$, the element $f^*_{ik}$ is called dual
with respect to $f_{ik}$. It can be shown that

$$f^*_{ik} = \frac{1}{2}\delta_{iklm}f_{lm}. \tag{A.I.49}$$


With the aid of this formula any antisymmetric tensor can be
brought into correspondence with its dual tensor. The dual tensor
$f^*_{ik}$ is, in a certain sense, equivalent to the initial tensor $f_{ik}$. We
have seen that the second group of Maxwell's equations takes the
simpler form when written via the dual tensor $F^*_{ik}$. The sum of the
products of the antisymmetric tensor components by their dual
co-factors yields a pseudoscalar:

$$F'^*_{ik}F'_{ik} = \frac{1}{2}\delta_{iklm}F'_{lm}F'_{ik} = \frac{1}{2}\alpha_{ia}\alpha_{kb}\alpha_{lc}\alpha_{md}\,\delta_{abcd}\,\alpha_{lr}\alpha_{ms}\alpha_{it}\alpha_{kn}F_{rs}F_{tn}$$

$$= \frac{1}{2}\delta_{cr}\delta_{ds}\delta_{at}\delta_{bn}\,\delta_{abcd}F_{rs}F_{tn} = \frac{1}{2}\delta_{tnrs}F_{rs}F_{tn} = F^*_{tn}F_{tn}.$$

In these equalities we made use of the definition (A.I.49), the
tensor transformation formulae (A.I.30) and the properties of the
Lorentz matrix coefficients (A.I.15). It can be easily shown that
$F_{ik}^2 = F_{ik}F_{ik}$ is an invariant:

$$F'_{ik}F'_{ik} = \alpha_{ia}\alpha_{kb}F_{ab}\,\alpha_{ic}\alpha_{kd}F_{cd} = \delta_{ac}\delta_{bd}F_{ab}F_{cd} = F_{ab}F_{ab}.$$

These two invariants were used in § 6.5.

§ 7. The stress tensor. The stress tensor is introduced in con- 
tinuum mechanics to characterize the force acting on a volume as 
a whole via the force acting on the surface confining this volume. 
We obtained the expressions of this type studying the forces in an 
electromagnetic field; it is useful to consider this problem in me- 
chanics where the physical essence of phenomena is most obvious. 

If an elastic body is subjected to deformation, forces arise in it
tending to return it to the state of equilibrium. These forces are
referred to as the internal stresses and are caused by the forces of
interaction of the molecules of the body. The distinctive feature of
these forces is their "small radius of action"; in other words, their
influence is felt only over microscopic (atomic) distances.
Therefore, it is clear that when a certain volume $\mathcal{V}$ is considered inside a
body, the forces acting on that volume reduce to the forces acting
across the surface confining that volume.

Indeed, let the force $\mathbf{F}$ act on a unit of volume of a body. Let us
single out the volume $\mathcal{V}$ within the body and consider the total force
acting on this volume. If the volume $d\mathcal{V}$ is subjected to the force
$\mathbf{F}\,d\mathcal{V}$, the total force acting on the volume is equal to

$$\int_{\mathcal{V}} \mathbf{F}\,d\mathcal{V}. \tag{A.I.50}$$


The various parts of the considered volume interact, but in ac- 
cordance with the law of equal action and reaction the forces of 
interaction cancel out and yield a zero resultant. Thus, the total 
force acting on the volume ¥ arises due to the forces exerted by 
the surrounding parts of the body. But as we have already men- 
tioned, these forces act only across the surface confining the con- 
sidered volume. Therefore, the total force will reduce to a certain 
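surface integral. In particular, the $\beta$th component of the force

$$\int_{\mathcal{V}} F_\beta\,d\mathcal{V} \tag{A.I.51}$$

must also turn into a surface integral. However, this is possible
only if $F_\beta$ can be represented in the form

$$F_\beta = \frac{\partial T_{\alpha\beta}}{\partial x_\alpha}, \tag{A.I.52}$$

where $T_{\alpha\beta}$ denotes the components of a tensor (only in this case
the convolution results in a vector). Then in accordance with Eq.
(A.I.44)

$$\int_{\mathcal{V}} F_\beta\,d\mathcal{V} = \int_{\mathcal{V}} \frac{\partial T_{\alpha\beta}}{\partial x_\alpha}\,d\mathcal{V} = \oint_S T_{\alpha\beta}n_\alpha\,dS. \tag{A.I.53}$$

Having multiplied both sides of Eq. (A.I.53) by $\mathbf{m}_\beta$ and carried
out the actual summation, we obtain

$$\int_{\mathcal{V}} \mathbf{F}\,d\mathcal{V} = \int_{\mathcal{V}} \frac{\partial T_{\alpha\beta}}{\partial x_\alpha}\mathbf{m}_\beta\,d\mathcal{V} = \oint_S T_{\alpha\beta}n_\alpha\mathbf{m}_\beta\,dS. \tag{A.I.54}$$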


The relation (A.I.54) shows that the total force acting on the
volume is reduced to the surface integral. Consequently, the result
obtained can be formulated as follows: if the force $\mathbf{F}$ acting on a
unit of volume can be represented as

$$\mathbf{F} = \frac{\partial T_{\alpha\beta}}{\partial x_\alpha}\mathbf{m}_\beta, \tag{A.I.55}$$

its action on the whole volume can be described as the action of a
surface force distributed over the surface confining this volume,
with the surface element $dS$, whose normal $\mathbf{n}$ has the unit vector
components $n_\alpha$, being subjected to the force

$$T_{\alpha\beta}n_\alpha\mathbf{m}_\beta. \tag{A.I.56}$$

Fig. A.2. (a) The stress exerted on the area $dS$ at the boundary surface of the
volume in which the stresses are generated; $\mathbf{n}$ is the normal to the surface
element $dS$; $\mathbf{p}_n$ is the force acting on the area whose normal is $\mathbf{n}$. (b) On
derivation of the condition for the equilibrium of a closed tetrahedron-shaped
elementary volume. The outward normal is chosen as a normal to the closed
tetrahedron surface. At the faces BOC, AOC and AOB the unit vectors of the normal
are equal to $-\mathbf{i}$, $-\mathbf{j}$ and $-\mathbf{k}$ respectively. The areas of the faces BOC, AOC and
AOB are equal to $dS\cos(\mathbf{n}, x)$, $dS\cos(\mathbf{n}, y)$ and $dS\cos(\mathbf{n}, z)$, respectively.


Let us discuss briefly the physical meaning of the stress tensor
components. Let us return to the volume $\mathcal{V}$ within the body
experiencing a deformation. The force acting on the surface element $dS$
confining the volume $\mathcal{V}$ depends on the value and direction of the
element $dS$, i.e. on the direction of the normal $\mathbf{n}$ to this
element. Let us denote this force by $\mathbf{p}_n\,dS$, having pointed out that
its direction, generally speaking, does not coincide with the
direction of the normal of the surface element $dS$ (Fig. A.2a). The
vector $\mathbf{p}_n$ is the force per unit of area; it depends on the
orientation of the surface element and is called the stress on the surface
element $dS$ with the normal $\mathbf{n}$. At each point of the deformed
elastic body any direction $\mathbf{n}$ has its corresponding stress vector $\mathbf{p}_n$.
In each Cartesian reference frame it is possible to determine the
stresses $\mathbf{p}_x$, $\mathbf{p}_y$, $\mathbf{p}_z$ acting on unit area elements whose normals
coincide with the coordinate axes. We shall demonstrate that the
stress relating to any area element $dS$ with the given normal $\mathbf{n}$
can be expressed through the nine components of the vectors $\mathbf{p}_x$, $\mathbf{p}_y$, $\mathbf{p}_z$.
These nine components taken together form the stress tensor $T_{\alpha\beta}$.

Let an elastically deformed body be in equilibrium. We shall
consider the infinitesimal tetrahedron OABC (Fig. A.2b) whose
inclined side has the area $dS$. Let the normal $\mathbf{n}$ of this side be
directed at an acute angle to the $x$ axis. Then the area elements
cut out by the coordinate planes will equal $dS\cos(\mathbf{n}, x)$,
$dS\cos(\mathbf{n}, y)$ and $dS\cos(\mathbf{n}, z)$. The normals of these area elements
are oriented in the direction opposite to the unit vectors
$\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ of the coordinate axes, so that the side BOC is subjected to the force
$-\mathbf{p}_x\,dS\cos(\mathbf{n}, x)$. The value of $\mathbf{p}_x$ can be taken at any point of
BOC since this side is infinitesimal. In much the same manner, the
forces acting on the sides AOC and AOB turn out to be equal to
$-\mathbf{p}_y\,dS\cos(\mathbf{n}, y)$ and $-\mathbf{p}_z\,dS\cos(\mathbf{n}, z)$ respectively. In
equilibrium the resultant of the forces acting on the tetrahedron is
equal to zero:

$$dS\,[\mathbf{p}_n - \mathbf{p}_x\cos(\mathbf{n}, x) - \mathbf{p}_y\cos(\mathbf{n}, y) - \mathbf{p}_z\cos(\mathbf{n}, z)] = 0,$$

whence the sought for stress $\mathbf{p}_n$ is expressed via $\mathbf{p}_x$, $\mathbf{p}_y$, $\mathbf{p}_z$:

$$\mathbf{p}_n = \mathbf{p}_x\cos(\mathbf{n}, x) + \mathbf{p}_y\cos(\mathbf{n}, y) + \mathbf{p}_z\cos(\mathbf{n}, z). \tag{A.I.57}$$

Passing to the symmetric notation in Eq. (A.I.57), we can show
that we obtained a tensor. Indeed, $\mathbf{n}$ is the vector of the normal of
an arbitrary side with the components $n_\alpha$ and therefore
$\mathbf{p}_n = \mathbf{p}_1n_1 + \mathbf{p}_2n_2 + \mathbf{p}_3n_3 = \mathbf{p}_\alpha n_\alpha$. But $\mathbf{p}_\alpha = p_{\alpha\beta}\mathbf{m}_\beta$, where $p_{\alpha\beta}$ are the
components of the vector $\mathbf{p}_\alpha$, and consequently,

$$\mathbf{p}_n = p_{\alpha\beta}\mathbf{m}_\beta n_\alpha. \tag{A.I.58}$$

It is seen from Eq. (A.I.58) that the nine components of the
vectors $\mathbf{p}_\alpha$ are transformed as a tensor (cf., e.g., Eq. (A.I.24)).
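Formula (A.I.58) is easy to exercise numerically. In the sketch below the stress components `P`, the normal `N` and the function name `stress_on_plane` are invented for illustration; the only thing taken from the text is the rule $\mathbf{p}_n = p_{\alpha\beta}n_\alpha\mathbf{m}_\beta$.

```python
import math


def stress_on_plane(p, n):
    """Components of p_n: sum over alpha of p[alpha][beta] * n[alpha]."""
    return [sum(p[a][b] * n[a] for a in range(3)) for b in range(3)]


# Hypothetical stress components p_{alpha beta}; row alpha holds the vector p_alpha
# acting on a unit area whose normal is the alpha-th coordinate axis.
P = [[10.0, 2.0, 0.0],
     [2.0, 5.0, 1.0],
     [0.0, 1.0, 3.0]]

# Unit normal equally inclined to the three axes (direction cosines 1/sqrt(3)).
N = [1.0 / math.sqrt(3.0)] * 3

P_N = stress_on_plane(P, N)
print(P_N)

# For a normal along the x axis the formula returns p_x itself.
print(stress_on_plane(P, [1.0, 0.0, 0.0]))
```

The second call is a simple consistency check: with $\cos(\mathbf{n}, x) = 1$ and the other cosines zero, Eq. (A.I.57) reduces to $\mathbf{p}_n = \mathbf{p}_x$.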

§ 8. The rectilinear oblique-angled systems of coordinates.
Until now we have used the orthogonal rectilinear system of
coordinates, but a transition to rectilinear oblique-angled systems of
coordinates makes it possible to illustrate the features characteristic
of arbitrary coordinate systems dealt with in this book.

Let us choose, as before, a set of straight lines, not orthogonal
this time, as coordinate axes and denote them by $x^1$ and $x^2$
(Fig. A.3a). Then let us mark a basis unit vector $\mathbf{m}_\mu$ on each
of these axes $x^\mu$.

An arbitrary vector $\mathbf{A}$ can be expanded into the non-collinear
vectors $\mathbf{m}_\mu$:

$$\mathbf{A} = A^\mu\mathbf{m}_\mu. \tag{A.I.59}$$

The quantities $A^\mu$ are the components of the vector $\mathbf{A}$ which are
obtained by means of the parallel projection of this vector on the
coordinate axes; according to the definition they are called the
contravariant components of the vector $\mathbf{A}$.

The quantities

$$A_\mu = \mathbf{A}\mathbf{m}_\mu \tag{A.I.60}$$

are the orthogonal projections of the vector $\mathbf{A}$ on the coordinate
axes and are referred to as the covariant components of the
vector $\mathbf{A}$. Obviously, these definitions can be retained for any number
of dimensions.

Fig. A.3. The definition of the co- and contravariant coordinates in a rectilinear
oblique-angled system of coordinates on the plane.

Designating the scalar product of the basis vectors

$$\mathbf{m}_\mu\mathbf{m}_\nu = g_{\mu\nu}, \tag{A.I.61}$$

we get $g_{\mu\nu} = g_{\nu\mu}$, and in the case of the rectilinear coordinate axes
$g_{\mu\nu} = \text{const}$. The covariant and contravariant coordinates relate
to the same vector and are interrelated:

$$A_\mu = \mathbf{A}\mathbf{m}_\mu = A^\nu\mathbf{m}_\nu\mathbf{m}_\mu = g_{\mu\nu}A^\nu. \tag{A.I.62}$$

This equation defines the transition from the contravariant
components of a vector to the covariant components.
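Then, let us define the quantities $g^{\mu\nu}$ by the condition:

$$g^{\nu\rho}g_{\rho\mu} = \delta^\nu_\mu = \begin{cases} 0, & \nu \neq \mu, \\ 1, & \nu = \mu. \end{cases} \tag{A.I.63}$$

Now let us construct the expression

$$g^{\mu\nu}A_\nu = g^{\mu\nu}g_{\nu\rho}A^\rho = \delta^\mu_\rho A^\rho = A^\mu. \tag{A.I.64}$$

The last equality defines the transition from the covariant to
contravariant components. Thus, we have obtained two fundamental
formulae of transition:

$$A_\mu = g_{\mu\nu}A^\nu, \qquad A^\mu = g^{\mu\nu}A_\nu. \tag{A.I.65}$$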
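The transition formulae (A.I.65) can be tried out on a plane with axes inclined at 60 degrees. Everything specific in this sketch is an assumption made for illustration: the angle, the vector, and the helper names `metric`, `inverse_2x2`, `lower` and `raise_index`. The inverse matrix is formed from co-factors divided by the determinant, which is the standard rule for a 2 by 2 matrix.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def metric(basis):
    """g_{mu nu} = m_mu . m_nu for basis vectors given by Cartesian components."""
    return [[dot(m_mu, m_nu) for m_nu in basis] for m_mu in basis]


def inverse_2x2(g):
    """g^{mu nu}: co-factors of the 2x2 matrix divided by its determinant."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]


def lower(g, a_contra):
    """A_mu = g_{mu nu} A^nu."""
    return [sum(g[mu][nu] * a_contra[nu] for nu in range(2)) for mu in range(2)]


def raise_index(g_inv, a_co):
    """A^mu = g^{mu nu} A_nu."""
    return [sum(g_inv[mu][nu] * a_co[nu] for nu in range(2)) for mu in range(2)]


ANGLE = math.radians(60.0)                      # assumed angle between the axes
BASIS = [[1.0, 0.0], [math.cos(ANGLE), math.sin(ANGLE)]]
G = metric(BASIS)
G_INV = inverse_2x2(G)

A_CONTRA = [2.0, 1.0]                           # parallel projections A^mu
A_CART = [sum(A_CONTRA[mu] * BASIS[mu][i] for mu in range(2)) for i in range(2)]
A_CO = lower(G, A_CONTRA)                       # should equal orthogonal projections

print(A_CO, [dot(A_CART, m) for m in BASIS], raise_index(G_INV, A_CO))
```

The covariant components obtained by lowering the index coincide with the orthogonal projections $\mathbf{A}\mathbf{m}_\mu$, and raising the index again returns the original contravariant components.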
The determinant formed by the quantities $g_{ik}$ is denoted by $g$
(see § 6 of this Appendix):

$$g = |g_{ik}|.$$

Using the formula (p. 380)

$$\sum_\alpha g_{\alpha k}A_{\alpha l} = \delta_{kl}\,g,$$

we immediately obtain

$$g^{\mu\nu} = \frac{A_{\mu\nu}}{g}, \tag{A.I.66}$$

where $A_{\mu\nu}$ is the co-factor of the element $g_{\mu\nu}$.

It is easily seen that for the orthogonal rectilinear coordinates,
when $\mathbf{m}_\mu\mathbf{m}_\nu = \delta_{\mu\nu}$, we have $g_{\mu\nu} = \delta_{\mu\nu}$ and $A_\mu = A^\mu$, i.e. there is no
difference between the covariant and contravariant coordinates. That is
why in the case of the orthogonal Cartesian system of coordinates
we simply speak of the coordinates of vectors.

By definition the scalar product of two vectors $\mathbf{A}$ and $\mathbf{B}$ is
the quantity

$$\mathbf{A}\mathbf{B} = (A^\mu\mathbf{m}_\mu)(B^\nu\mathbf{m}_\nu) = A^\mu B^\nu(\mathbf{m}_\mu\mathbf{m}_\nu) = g_{\mu\nu}A^\mu B^\nu = A_\nu B^\nu. \tag{A.I.67}$$

The scalar product of a vector by itself defines the square of the
vector's absolute value or the norm of the vector:

$$\mathbf{A}^2 = g_{\mu\nu}A^\mu A^\nu = A_\nu A^\nu. \tag{A.I.68}$$

Thus, the norm is the square of the vector's length. If the norm
of the vector is equal to unity, the vector is called normed or
unitary. If the norm of any non-zero vector is positive, the space is
referred to as the proper Euclidean space.

In particular, the square of the infinitesimal vector $d\mathbf{r}$ possessing
the components $dx^\mu$ and connecting two infinitely close points of
space is equal to

$$d\mathbf{r}^2 = ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu. \tag{A.I.69}$$


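Let us change the system of base axes and pass to new basis
vectors $\mathbf{m}'_\mu$ directed along the rectilinear axes $x'^\mu$ (Fig. A.3b). Any
new vector $\mathbf{m}'_\mu$ can be expanded into the old basis vectors:

$$\mathbf{m}'_\mu = a^\nu_\mu\mathbf{m}_\nu, \tag{A.I.70}$$

where $a^\nu_\mu$ are constant coefficients dependent on the concrete
transformation of oblique-angled axes. The number of independent
basis vectors $\mathbf{m}'_\mu$ does not change under any transformation
provided $|a^\nu_\mu| = a \neq 0$. Of course, any
old vector $\mathbf{m}_\mu$ can be expanded into the new ones:

$$\mathbf{m}_\mu = a'^\nu_\mu\mathbf{m}'_\nu, \qquad |a'^\nu_\mu| = a' \neq 0. \tag{A.I.71}$$

It follows from Eqs. (A.I.70) and (A.I.71) that

$$\mathbf{m}'_\mu = a^\nu_\mu\mathbf{m}_\nu = a^\nu_\mu a'^\rho_\nu\mathbf{m}'_\rho; \qquad \mathbf{m}_\mu = a'^\nu_\mu\mathbf{m}'_\nu = a'^\nu_\mu a^\rho_\nu\mathbf{m}_\rho. \tag{A.I.72}$$

From Eq. (A.I.72) the coefficients $a^\nu_\mu$ and $a'^\nu_\mu$ are seen to be
related by the following expressions:

$$a^\nu_\mu a'^\rho_\nu = \delta^\rho_\mu, \qquad a'^\nu_\mu a^\rho_\nu = \delta^\rho_\mu, \tag{A.I.73}$$

where $\delta^\rho_\mu$ is defined in accordance with Eq. (A.I.63).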
The radius vector $\mathbf{r}$ drawn from the origin of coordinates to the
point $M$ (Fig. A.3b) can be written in two forms:

$$\mathbf{r} = x^\mu\mathbf{m}_\mu = x'^\mu\mathbf{m}'_\mu. \tag{A.I.74}$$

Taking into account Eqs. (A.I.70) and (A.I.71), the last equation
can be rewritten in two forms as well:

$$x'^\mu a^\nu_\mu\mathbf{m}_\nu = x^\nu\mathbf{m}_\nu, \qquad x^\mu a'^\nu_\mu\mathbf{m}'_\nu = x'^\nu\mathbf{m}'_\nu, \tag{A.I.75}$$

whence follow the formulae for the direct and inverse
transformations of the contravariant coordinates of the vector $\mathbf{r}$:

$$x^\nu = a^\nu_\mu x'^\mu, \qquad x'^\nu = a'^\nu_\mu x^\mu. \tag{A.I.76}$$


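Here is the definition of a vector: the vector $\mathbf{A}$ is the quantity
whose covariant components are transformed under a transition
to a new reference frame in the same manner as the basis
vectors $\mathbf{m}_\mu$. The contravariant components of vectors are transformed
as the contravariant coordinates $x^\mu$. Let us find the formulae for
the transformation of the components of the vector $\mathbf{A}$. For
covariant components

$$A'_\mu = \mathbf{A}\mathbf{m}'_\mu = \mathbf{A}a^\nu_\mu\mathbf{m}_\nu = a^\nu_\mu A_\nu. \tag{A.I.77}$$

The inverse transformation formula takes the form

$$A_\mu = \mathbf{A}\mathbf{m}_\mu = \mathbf{A}a'^\nu_\mu\mathbf{m}'_\nu = a'^\nu_\mu A'_\nu. \tag{A.I.78}$$

On the other hand, precisely as for the vector $\mathbf{r}$ we can write

$$\mathbf{A} = A^\mu\mathbf{m}_\mu = A'^\mu\mathbf{m}'_\mu, \tag{A.I.79}$$

whence

$$A'^\mu a^\nu_\mu\mathbf{m}_\nu = A^\nu\mathbf{m}_\nu, \qquad A^\mu a'^\nu_\mu\mathbf{m}'_\nu = A'^\nu\mathbf{m}'_\nu, \tag{A.I.80}$$

and consequently

$$A^\nu = a^\nu_\mu A'^\mu, \qquad A'^\nu = a'^\nu_\mu A^\mu. \tag{A.I.81}$$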


We see that the transformation formulae for the covariant and
contravariant components of a vector are different.

Let us write the transformation law for the quantity (A.I.61):

$$g'_{\mu\nu} = (\mathbf{m}'_\mu\mathbf{m}'_\nu) = a^\rho_\mu\mathbf{m}_\rho\,a^\sigma_\nu\mathbf{m}_\sigma = a^\rho_\mu a^\sigma_\nu g_{\rho\sigma}. \tag{A.I.82}$$

According to the definition, this is the transformation law for a
covariant tensor. The quantity retaining its value when the basis
vectors change according to (A.I.70) and (A.I.71) is referred to as an
invariant. In the considered case of the rectilinear oblique-angled
axes $a^\nu_\mu$ and $a'^\nu_\mu$ are constant values. Let us demonstrate the
invariance of the distance between points:

$$ds'^2 = g'_{\mu\nu}\,dx'^\mu\,dx'^\nu = a^\rho_\mu\,dx'^\mu\,a^\sigma_\nu\,dx'^\nu\,g_{\rho\sigma} = g_{\rho\sigma}\,dx^\rho\,dx^\sigma \tag{A.I.83}$$

(see Eq. (A.I.81)).

The invariance of the operator $\Delta = \dfrac{\partial^2}{\partial x_\mu\,\partial x^\mu}$ can also be easily
verified.

When, for diverse reasons, vectors are introduced, both covariant
and contravariant components may be found among the
components of the vectors. We shall quote two important examples. From
Eq. (A.I.81) it follows that

$$dA'^\nu = a'^\nu_\mu\,dA^\mu, \tag{A.I.84}$$

whence it is clear that the differentials of the contravariant
coordinates of a vector are transformed as contravariant vectors.
However, having considered the scalar function of contravariant
components $\varphi(x^\mu)$ and in particular the components of the vector
$\partial\varphi/\partial x^\mu$, we immediately make sure that we deal with the covariant
components.

Indeed, $\varphi = \varphi(x'^\nu) = \varphi[x'^\nu(x^\alpha)]$; the coordinate transformation
is implied to be known; as usual, we shall write out the formulae
with "convenient" indices. Thus, in accordance with Eq. (A.I.76)
$x'^\nu = a'^\nu_\alpha x^\alpha$, whence $\partial x'^\nu/\partial x^\alpha = a'^\nu_\alpha$. In accordance with the
formulae for differentiating a composite function

$$\frac{\partial\varphi}{\partial x^\alpha} = \frac{\partial\varphi}{\partial x'^\nu}\frac{\partial x'^\nu}{\partial x^\alpha} = a'^\nu_\alpha\frac{\partial\varphi}{\partial x'^\nu}; \tag{A.I.85}$$

this is precisely the transformation formula for the components of
the covariant vector (A.I.77). Once again we point out that all
formulae obtained are valid for the space of any number of
dimensions.

Passing to the 4-space-time of Minkowski, we shall recall that
a consequence of Einstein's two postulates is the invariance of the
quadratic form

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2 \tag{A.I.86}$$

on transition from one IFR to another, i.e. under the Lorentz
transformation. The expression (A.I.86) defines the square of the
elementary "distance" in 4-space. But the square of the distance
(A.I.86) is not necessarily positive. In this connection the
Euclidean space determined by the form (A.I.86) is referred to as the
improper Euclidean or pseudo-Euclidean space. To take advantage
of the formalism of the proper Euclidean space, we can resort to
the technique used in this book and consisting in the introduction
of the imaginary coordinate (cf. Chapter 3). This technique
simplifies the presentation, but at the same time it unintentionally
insinuates the idea of an imaginary nature of relativistic laws
themselves which, of course, have nothing to do with the number $i$
utilized only for the purpose of making calculations easier.

Resorting to the real coordinates $x^0 = ct$, $x^1 = x$, $x^2 = y$, $x^3 = z$,
we can write Eq. (A.I.86) as

$$ds^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2. \tag{A.I.87}$$

In any case

$$d\mathbf{R} = \mathbf{m}_0\,dx^0 + \mathbf{m}_1\,dx^1 + \mathbf{m}_2\,dx^2 + \mathbf{m}_3\,dx^3. \tag{A.I.88}$$

The expression for $ds^2$ from Eq. (A.I.87) coincides with that from
Eq. (A.I.86), provided the following conditions are satisfied:

$$\mathbf{m}_0^2 = 1, \qquad \mathbf{m}_1^2 = \mathbf{m}_2^2 = \mathbf{m}_3^2 = -1; \tag{A.I.89}$$

$$\mathbf{m}_i\mathbf{m}_k = 0 \quad \text{for } i \neq k \quad (i, k = 0, 1, 2, 3). \tag{A.I.90}$$

All these conditions can be expressed by one formula

$$\mathbf{m}_i\mathbf{m}_k = g_{ik}, \tag{A.I.91}$$

where

$$g_{ik} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{A.I.92}$$

Now $g_{ik}$ defines the metric tensor of the pseudo-Euclidean space.
Since $A_\mu = g_{\mu\nu}A^\nu$, the relationship between the covariant and
contravariant vector components follows immediately from this:

$$A_0 = A^0, \quad A_1 = -A^1, \quad A_2 = -A^2, \quad A_3 = -A^3. \tag{A.I.93}$$

The scalar product of two vectors and the norm of a vector are
defined by the following expressions:

$$\mathbf{A}\mathbf{B} = g_{\mu\nu}A^\mu B^\nu = A^0B^0 - A^1B^1 - A^2B^2 - A^3B^3, \tag{A.I.94}$$

$$\mathbf{A}^2 = g_{\mu\nu}A^\mu A^\nu = (A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2. \tag{A.I.95}$$

It is seen from Eq. (A.I.95) that the norm of an arbitrary
non-zero real vector is not necessarily positive: it can also be zero or
negative. This indicates once more that the four-dimensional space
of the special theory of relativity is pseudo-Euclidean.
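The statements about the sign of the norm and its invariance can be illustrated with real coordinates. In this sketch the boost along the $x$ axis is written in its usual real form with velocity parameter $\beta$, which is an assumption about how the Lorentz transformation of the main text looks in these coordinates; the function names and sample 4-vectors are likewise hypothetical.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]   # diagonal of g_ik from (A.I.92)


def lower_index(a_contra):
    """A_mu = g_{mu nu} A^nu for the diagonal metric."""
    return [g * a for g, a in zip(METRIC, a_contra)]


def scalar_product(a, b):
    """A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def boost_x(a, beta):
    """Lorentz boost along x in real coordinates (assumed standard form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    a0, a1, a2, a3 = a
    return [gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3]


TIMELIKE = [2.0, 1.0, 0.0, 0.0]    # positive norm
LIGHTLIKE = [1.0, 1.0, 0.0, 0.0]   # zero norm
SPACELIKE = [1.0, 2.0, 0.0, 0.0]   # negative norm

for vec in (TIMELIKE, LIGHTLIKE, SPACELIKE):
    moved = boost_x(vec, 0.6)
    print(scalar_product(vec, vec), scalar_product(moved, moved))
```

Each pair of printed numbers agrees up to rounding, while the three vectors have norms of different sign, which is exactly the pseudo-Euclidean property.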

§ 9. The definition of hyperbolic functions and some
relationships between them. For real $x$ the basic definitions are as
follows:

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2},$$

$$\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{1 - e^{-2x}}{1 + e^{-2x}}.$$

It is seen at once that

$$\cosh^2 x - \sinh^2 x = \frac{1}{4}\left\{(e^x + e^{-x})^2 - (e^x - e^{-x})^2\right\} = \frac{1}{4}\cdot 4 = 1.$$

Dividing the left-hand and the extreme right-hand sides of the last
equality by $\cosh^2 x$, we get

$$1 - \tanh^2 x = \frac{1}{\cosh^2 x}, \quad \text{or} \quad \cosh x = \frac{1}{\sqrt{1 - \tanh^2 x}}.$$

To obtain the formulae

$$\cosh(x_1 + x_2) = \cosh x_1\cosh x_2 + \sinh x_1\sinh x_2,$$

$$\sinh(x_1 + x_2) = \sinh x_1\cosh x_2 + \sinh x_2\cosh x_1,$$

one needs to substitute into the definitions

$$\cosh(x_1 + x_2) = \frac{e^{x_1}e^{x_2} + e^{-x_1}e^{-x_2}}{2}, \qquad \sinh(x_1 + x_2) = \frac{e^{x_1}e^{x_2} - e^{-x_1}e^{-x_2}}{2}$$

the values following from the definition of hyperbolic functions:

$$e^{x_{1,2}} = \cosh x_{1,2} + \sinh x_{1,2}, \qquad e^{-x_{1,2}} = \cosh x_{1,2} - \sinh x_{1,2}.$$


Finally, here is the very important formula for real $x$:

\[ \tanh (x_1 + x_2) = \frac{\sinh (x_1 + x_2)}{\cosh (x_1 + x_2)} = \frac{\sinh x_1 \cosh x_2 + \cosh x_1 \sinh x_2}{\cosh x_1 \cosh x_2 + \sinh x_1 \sinh x_2} = \frac{\tanh x_1 + \tanh x_2}{1 + \tanh x_1 \tanh x_2}. \]
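
The relationships of this section can be verified numerically. The sketch below is an illustration only: it uses Python's standard `math` module, and the helper names `tanh_sum` and `cosh_from_tanh` as well as the sample arguments are assumptions made for the example.

```python
import math

def tanh_sum(x1, x2):
    """Right-hand side of the addition formula for tanh(x1 + x2)."""
    t1, t2 = math.tanh(x1), math.tanh(x2)
    return (t1 + t2) / (1.0 + t1 * t2)

def cosh_from_tanh(x):
    """cosh x expressed through tanh x: 1 / sqrt(1 - tanh^2 x)."""
    return 1.0 / math.sqrt(1.0 - math.tanh(x) ** 2)

x1, x2 = 0.7, 1.3  # arbitrary real arguments

# Addition formula and the cosh-tanh relationship.
ok_tanh = math.isclose(tanh_sum(x1, x2), math.tanh(x1 + x2))
ok_cosh = math.isclose(cosh_from_tanh(x1), math.cosh(x1))

# Basic identity cosh^2 x - sinh^2 x = 1.
ok_identity = math.isclose(math.cosh(x1) ** 2 - math.sinh(x1) ** 2, 1.0)
```

All three flags evaluate to `True` for any moderate real arguments; for very large arguments the check of `cosh_from_tanh` loses accuracy because `tanh` saturates at unity in floating point.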


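In this book (Chapter 5) we also make use of the expansion

\[ \cosh x = \frac{e^{x} + e^{-x}}{2} = \frac{\left(1 + x + \frac{x^2}{2} + \ldots\right) + \left(1 - x + \frac{x^2}{2} - \ldots\right)}{2} = 1 + \frac{x^2}{2} + \ldots, \quad \sinh x \approx x. \]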
The relationship between hyperbolic and trigonometric functions
is determined as follows. In the definitions of trigonometric
functions (the Euler formulae)

\[ \sin z = \frac{e^{iz} - e^{-iz}}{2i}, \quad \cos z = \frac{e^{iz} + e^{-iz}}{2}, \quad \tan z = \frac{1}{i}\,\frac{e^{iz} - e^{-iz}}{e^{iz} + e^{-iz}} \]

we substitute an imaginary value of $z$, i.e. assume $z = i\varphi$. Then it
is immediately seen that

\[ \sin (i\varphi) = i \sinh \varphi, \quad \cos (i\varphi) = \cosh \varphi, \quad \tan (i\varphi) = i \tanh \varphi, \]

recalling that $i^2 = -1$ and $1/i = -i$.
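
These three relationships can be confirmed with complex arithmetic. The following sketch relies on Python's standard `cmath` module; the value of `phi` is an arbitrary sample and the variable names are illustrative.

```python
import cmath
import math

phi = 0.9          # arbitrary real argument
z = 1j * phi       # purely imaginary argument z = i*phi

# Left-hand sides: trigonometric functions of an imaginary argument.
sin_z, cos_z, tan_z = cmath.sin(z), cmath.cos(z), cmath.tan(z)

# Right-hand sides: i*sinh(phi), cosh(phi), i*tanh(phi).
rhs_sin = 1j * math.sinh(phi)
rhs_cos = complex(math.cosh(phi), 0.0)
rhs_tan = 1j * math.tanh(phi)

max_error = max(abs(sin_z - rhs_sin), abs(cos_z - rhs_cos), abs(tan_z - rhs_tan))
```

The quantity `max_error` stays at the level of floating-point rounding, which reflects the fact that the identities are exact.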
APPENDIX II


THE BASIC FORMULAE
OF ELECTRODYNAMICS
IN THE GAUSSIAN SYSTEM



In electrodynamics the Gaussian system is used as often as the
SI system, to which we adhered in Chapter 6. Of course, the choice of
the system of units does not affect the essence of the matter, but the
appearance of the formulae changes. For the reader's convenience we
cite the basic formulae in the Gaussian system. The formulae
carry the same numbers under which they are given in Chapter 6.

The equations for the potentials $\mathbf{A}$ and $\varphi$ have the following
form in vacuo:


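\[ \Box \mathbf{A} = \Delta \mathbf{A} - \frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf{j}, \qquad \Box \varphi = \Delta \varphi - \frac{1}{c^2} \frac{\partial^2 \varphi}{\partial t^2} = -4\pi\rho. \tag{6.9} \]

The Lorentz condition:

\[ \operatorname{div} \mathbf{A} + \frac{1}{c} \frac{\partial \varphi}{\partial t} = 0. \tag{6.8} \]

The charge conservation law remains unchanged:

\[ \frac{\partial \rho}{\partial t} + \operatorname{div} \mathbf{j} = 0. \tag{6.4} \]

The definition of the 4-potential and 4-current:

\[ \vec{\Phi}\,(\mathbf{A},\ i\varphi), \quad \vec{\Phi}\,(\varphi,\ \mathbf{A}), \tag{6.11} \]

\[ \vec{s}\,(\mathbf{j},\ ic\rho), \quad \vec{s}\,(c\rho,\ \mathbf{j}). \tag{6.12} \]

The relation between the average fields and potentials:

\[ \mathbf{B} = \operatorname{rot} \mathbf{A}, \quad \mathbf{E} = -\operatorname{grad} \varphi - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t}. \tag{6.25} \]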
The tensors of an electromagnetic field in vacuo:

\[ F_{ik} = \frac{\partial \Phi_k}{\partial x_i} - \frac{\partial \Phi_i}{\partial x_k}, \tag{6.28} \]

\[ F_{ik} = f_{ik} = \begin{pmatrix} 0 & H_z & -H_y & -iE_x \\ -H_z & 0 & H_x & -iE_y \\ H_y & -H_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix}. \tag{6.29} \]
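The tensors of an electromagnetic field in matter:

\[ f_{ik} = \begin{pmatrix} 0 & H_z & -H_y & -iD_x \\ -H_z & 0 & H_x & -iD_y \\ H_y & -H_x & 0 & -iD_z \\ iD_x & iD_y & iD_z & 0 \end{pmatrix}, \]

\[ F_{ik} = \begin{pmatrix} 0 & B_z & -B_y & -iE_x \\ -B_z & 0 & B_x & -iE_y \\ B_y & -B_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix}. \tag{6.31} \]

The tensor of the moments:

\[ \mathcal{M}_{ik} = \begin{pmatrix} 0 & M_z & -M_y & iP_x \\ -M_z & 0 & M_x & iP_y \\ M_y & -M_x & 0 & iP_z \\ -iP_x & -iP_y & -iP_z & 0 \end{pmatrix}; \tag{6.33} \]

now $\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M}$ and $\mathbf{E} = \mathbf{D} - 4\pi\mathbf{P}$.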

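The field transformation formulae:

\[ \begin{aligned} E_x &= E'_x, & B_x &= B'_x, \\ E_y &= \Gamma \left( E'_y + \frac{V}{c} B'_z \right), & B_y &= \Gamma \left( B'_y - \frac{V}{c} E'_z \right), \\ E_z &= \Gamma \left( E'_z - \frac{V}{c} B'_y \right), & B_z &= \Gamma \left( B'_z + \frac{V}{c} E'_y \right); \end{aligned} \tag{6.36} \]

\[ \begin{aligned} D_x &= D'_x, & H_x &= H'_x, \\ D_y &= \Gamma \left( D'_y + \frac{V}{c} H'_z \right), & H_y &= \Gamma \left( H'_y - \frac{V}{c} D'_z \right), \\ D_z &= \Gamma \left( D'_z - \frac{V}{c} H'_y \right), & H_z &= \Gamma \left( H'_z + \frac{V}{c} D'_y \right). \end{aligned} \tag{6.37} \]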
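The same formulae expressed via the projections on the relative
velocity direction and the direction perpendicular to it:

\[ \mathbf{E}_\parallel = \mathbf{E}'_\parallel, \quad \mathbf{E}_\perp = \Gamma \left( \mathbf{E}'_\perp - \frac{1}{c} [\mathbf{V}\mathbf{B}'] \right), \quad \mathbf{B}_\parallel = \mathbf{B}'_\parallel, \quad \mathbf{B}_\perp = \Gamma \left( \mathbf{B}'_\perp + \frac{1}{c} [\mathbf{V}\mathbf{E}'] \right); \tag{6.38} \]

\[ \mathbf{D}_\parallel = \mathbf{D}'_\parallel, \quad \mathbf{D}_\perp = \Gamma \left( \mathbf{D}'_\perp - \frac{1}{c} [\mathbf{V}\mathbf{H}'] \right), \quad \mathbf{H}_\parallel = \mathbf{H}'_\parallel, \quad \mathbf{H}_\perp = \Gamma \left( \mathbf{H}'_\perp + \frac{1}{c} [\mathbf{V}\mathbf{D}'] \right). \tag{6.42} \]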
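
The transformation of E and B in the projection form of Eq. (6.38) can be sketched in Python as follows. This is an illustration under stated assumptions: Gaussian units, an arbitrary direction of the relative velocity V, and c = 1 by default. The function names `cross`, `dot` and `fields_from_primed` and the sample field values are hypothetical and are not taken from the book.

```python
import math

def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def fields_from_primed(E_p, B_p, V, c=1.0):
    """Return (E, B) built from the primed fields E', B' according to Eq. (6.38)."""
    v2 = dot(V, V)
    if v2 == 0.0:                      # no relative motion: fields coincide
        return tuple(E_p), tuple(B_p)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c ** 2)

    def split(F):
        # Components parallel and perpendicular to the velocity V.
        k = dot(F, V) / v2
        par = tuple(k * x for x in V)
        perp = tuple(f - p for f, p in zip(F, par))
        return par, perp

    E_par, E_perp = split(E_p)
    B_par, B_perp = split(B_p)
    VxB, VxE = cross(V, B_p), cross(V, E_p)
    E = tuple(E_par[i] + gamma * (E_perp[i] - VxB[i] / c) for i in range(3))
    B = tuple(B_par[i] + gamma * (B_perp[i] + VxE[i] / c) for i in range(3))
    return E, B

E_p, B_p = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)   # sample primed fields
E, B = fields_from_primed(E_p, B_p, V=(0.6, 0.0, 0.0))
```

For a velocity directed along the x axis the result agrees with the component formulae (6.36), and the combinations E² − B² and EB keep their values, as expected for the invariants of the field.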
The Lorentz force density:

\[ \mathbf{f} = \rho \left\{ \mathbf{E} + \frac{1}{c} [\mathbf{v}\mathbf{B}] \right\}. \tag{6.49} \]

The field invariants (§ 6.5):
h=Fr=E°—H’, Ip=2iEH. 


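The four-dimensional expression for the Lorentz force density is
the same in SI and the Gaussian system:

\[ f_i = \frac{1}{c} F_{ik} s_k. \tag{6.53} \]

The Maxwell equations in the three-dimensional form:

\[ \operatorname{rot} \mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c} \frac{\partial \mathbf{D}}{\partial t}, \quad \operatorname{div} \mathbf{D} = 4\pi\rho, \tag{6.56} \]

\[ \operatorname{rot} \mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{B}}{\partial t}, \quad \operatorname{div} \mathbf{B} = 0. \tag{6.57} \]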


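The Maxwell equations in the four-dimensional form:

\[ \frac{\partial f_{ik}}{\partial x_k} = \frac{4\pi}{c}\, s_i, \tag{6.60} \]

\[ \frac{\partial F_{ik}}{\partial x_l} + \frac{\partial F_{kl}}{\partial x_i} + \frac{\partial F_{li}}{\partial x_k} = 0. \tag{6.67} \]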


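The material equations in the three-dimensional form do not vary:

\[ \mathbf{D}' = \varepsilon \mathbf{E}', \tag{6.68} \]

\[ \mathbf{B}' = \mu \mathbf{H}', \tag{6.69} \]

\[ \mathbf{j}' = \sigma \mathbf{E}'. \tag{6.70} \]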
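The material relations in a moving medium:

\[ \mathbf{D} + \frac{1}{c} [\mathbf{V}\mathbf{H}] = \varepsilon \left( \mathbf{E} + \frac{1}{c} [\mathbf{V}\mathbf{B}] \right), \tag{6.74} \]

\[ \mathbf{B} - \frac{1}{c} [\mathbf{V}\mathbf{E}] = \mu \left( \mathbf{H} - \frac{1}{c} [\mathbf{V}\mathbf{D}] \right). \tag{6.75} \]

Solving these equations with respect to $\mathbf{D}$ and $\mathbf{B}$, we shall obtain
($\mathrm{B} = V/c$)

\[ \mathbf{D} = \frac{1}{1 - \varepsilon\mu \mathrm{B}^2} \left\{ \varepsilon \mathbf{E}\,(1 - \mathrm{B}^2) + (\varepsilon\mu - 1) \left( \frac{1}{c} [\mathbf{V}\mathbf{H}] - \frac{\varepsilon}{c^2} \mathbf{V} (\mathbf{V}\mathbf{E}) \right) \right\}, \]

\[ \mathbf{B} = \frac{1}{1 - \varepsilon\mu \mathrm{B}^2} \left\{ \mu \mathbf{H}\,(1 - \mathrm{B}^2) - (\varepsilon\mu - 1) \left( \frac{1}{c} [\mathbf{V}\mathbf{E}] + \frac{\mu}{c^2} \mathbf{V} (\mathbf{V}\mathbf{H}) \right) \right\}, \]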
whence 
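\[ \mathbf{D}_\parallel = \varepsilon \mathbf{E}_\parallel, \quad \mathbf{B}_\parallel = \mu \mathbf{H}_\parallel, \tag{6.76} \]

\[ \begin{aligned} (1 - \varepsilon\mu \mathrm{B}^2)\, \mathbf{D}_\perp &= \varepsilon (1 - \mathrm{B}^2)\, \mathbf{E}_\perp + (\varepsilon\mu - 1)\, \frac{1}{c} [\mathbf{V}\mathbf{H}], \\ (1 - \varepsilon\mu \mathrm{B}^2)\, \mathbf{B}_\perp &= \mu (1 - \mathrm{B}^2)\, \mathbf{H}_\perp - (\varepsilon\mu - 1)\, \frac{1}{c} [\mathbf{V}\mathbf{E}]. \end{aligned} \tag{6.77} \]

Ignoring the values of $\mathrm{B}^2$ and $\varepsilon\mu \mathrm{B}^2$ as compared to unity in Eq.
(6.77), we shall get

\[ \begin{aligned} \mathbf{D} &= \varepsilon \mathbf{E} + \frac{1}{c} (\varepsilon\mu - 1) [\mathbf{V}\mathbf{H}], \\ \mathbf{B} &= \mu \mathbf{H} - \frac{1}{c} (\varepsilon\mu - 1) [\mathbf{V}\mathbf{E}]. \end{aligned} \tag{6.78} \]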
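
That the explicit expressions for D and B satisfy the relations (6.74) and (6.75) can be checked numerically. The sketch below is a hedged illustration: Gaussian units with c = 1 by default, the standard exact solution of (6.74) and (6.75) written in vector form, and hypothetical function names (`solve_D_B`, `residuals`) and sample values chosen only for the example.

```python
def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def solve_D_B(E, H, V, eps, mu, c=1.0):
    """D and B in a medium moving with velocity V, expressed through E and H."""
    beta2 = dot(V, V) / c ** 2
    k = eps * mu - 1.0
    denom = 1.0 - eps * mu * beta2     # assumed nonzero
    VxH, VxE = cross(V, H), cross(V, E)
    VE, VH = dot(V, E), dot(V, H)
    D = tuple((eps * (1.0 - beta2) * E[i]
               + k * (VxH[i] / c - eps * VE * V[i] / c ** 2)) / denom
              for i in range(3))
    B = tuple((mu * (1.0 - beta2) * H[i]
               - k * (VxE[i] / c + mu * VH * V[i] / c ** 2)) / denom
              for i in range(3))
    return D, B

def residuals(E, H, V, eps, mu, c=1.0):
    """Largest absolute mismatch in Eqs. (6.74) and (6.75)."""
    D, B = solve_D_B(E, H, V, eps, mu, c)
    VxH, VxB = cross(V, H), cross(V, B)
    VxE, VxD = cross(V, E), cross(V, D)
    r74 = max(abs(D[i] + VxH[i] / c - eps * (E[i] + VxB[i] / c)) for i in range(3))
    r75 = max(abs(B[i] - VxE[i] / c - mu * (H[i] - VxD[i] / c)) for i in range(3))
    return r74, r75

r74, r75 = residuals(E=(1.0, -2.0, 0.5), H=(0.3, 0.8, -1.1),
                     V=(0.3, 0.1, -0.2), eps=2.0, mu=1.5)
```

Both residuals are at the level of rounding error. For V = 0 the function returns D = εE and B = μH, i.e. the material equations of the medium at rest.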
The material equations in the four-dimensional form:

\[ f_{ik} u_k = \varepsilon F_{ik} u_k, \tag{6.79} \]

\[ F_{ik} u_l + F_{kl} u_i + F_{li} u_k = \mu \left( f_{ik} u_l + f_{kl} u_i + f_{li} u_k \right), \tag{6.80} \]
5) = Filty. (6.81) 


The electromagnetic field energy density:

\[ w = \frac{\mathbf{E}\mathbf{D} + \mathbf{B}\mathbf{H}}{8\pi}. \]

The Poynting-Umov vector:

\[ \mathbf{S} = \frac{c}{4\pi} [\mathbf{E}\mathbf{H}]. \]
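
As a simple illustration of these two formulae, the sketch below evaluates w and S for a plane wave in vacuo, where in the Gaussian system D = E, B = H and |E| = |H|. The function names, the amplitude `E0` and the value assigned to `c` (in cm/s) are assumptions made for the example only.

```python
import math

def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def energy_density(E, D, B, H):
    """w = (ED + BH) / (8 pi), Gaussian units."""
    return (dot(E, D) + dot(B, H)) / (8.0 * math.pi)

def poynting(E, H, c):
    """S = c / (4 pi) [E H], Gaussian units."""
    return tuple(c / (4.0 * math.pi) * x for x in cross(E, H))

c = 2.998e10                 # speed of light in cm/s (rounded value)
E0 = 3.0                     # hypothetical field amplitude
E = (0.0, E0, 0.0)           # wave travelling along x: E along y
H = (0.0, 0.0, E0)           # H along z, equal in magnitude to E

w = energy_density(E, E, H, H)   # in vacuo D = E and B = H
S = poynting(E, H, c)
```

The vector S points along the x axis and its magnitude equals c·w, i.e. the energy of the wave is carried with the velocity of light.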
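The energy-momentum-tension tensor in vacuo:

\[ T_{ik} = \begin{pmatrix} T_{\alpha\beta} & -ic\,\mathbf{g} \\ -\dfrac{i}{c}\,\mathbf{S} & w \end{pmatrix}, \quad \mathbf{g} = \frac{\mathbf{S}}{c^2} = \frac{1}{4\pi c} [\mathbf{E}\mathbf{H}], \]

\[ T_{\alpha\beta} = \frac{1}{4\pi} \left\{ E_\alpha D_\beta + H_\alpha B_\beta \right\} - \delta_{\alpha\beta}\, w. \]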


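The energy-momentum-tension tensors in matter (Minkowski and
Abraham):

\[ T^{M}_{ik} = \begin{pmatrix} T^{M}_{\alpha\beta} & -ic\,\mathbf{g}^{M} \\ -\dfrac{i}{c}\,\mathbf{S} & w \end{pmatrix}, \]

\[ T^{A}_{ik} = \begin{pmatrix} T^{A}_{\alpha\beta} & -ic\,\mathbf{g}^{A} \\ -\dfrac{i}{c}\,\mathbf{S} & w \end{pmatrix}, \quad \mathbf{g}^{A} = \frac{\mathbf{S}}{c^2} = \frac{1}{4\pi c} [\mathbf{E}\mathbf{H}]. \]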